Release announcements, Security

Zulip Server 4.4 security release

Alex Vandiver 5 min read

We released Zulip Server 4.4 today! This is a security release, containing important security fixes, as well as important cherry-picked bug fixes, since Zulip Server 4.3.

What’s new

This release fixes the following issues:

  • Added a tool to fix potential database corruption caused by host OS upgrades; see below.
  • Fixed a possible denial-of-service attack in Markdown fenced code block parsing.
  • Smokescreen, if installed, now defaults to only listening on 127.0.0.1; this prevents it from being used as an open HTTP proxy if it did not have other firewalls protecting incoming port 4750.
  • Fixed a performance/scalability issue for installations using the S3 file uploads backend.
  • Fixed a bug where users could turn other users’ messages they could read into widgets (e.g. polls).
  • Fixed a bug where emoji and avatar image requests were sent through Camo; doing so does not add any security benefit, and broke custom emoji that had been imported from Slack in Zulip 1.8.1 or earlier.
  • Changed to log just a warning, instead of an exception, in the case that the embed_links worker cannot fetch previews for all links in a message within the 30-second timeout. Each preview request within a message already has a 15-second timeout.
  • Ensured psycopg2 is installed before starting process_fts_updates; otherwise, it might fail to start several times before the package was installed.
  • Worked around a bug in supervisor where, when using SysV init, /etc/init.d/supervisor restart would only have stopped, not restarted, the process.
  • Modified upgrade scripts to better handle failure, suggest next steps, and point to logs.
  • Zulip now hides the “show password” eye icon that IE and Edge browsers place in password inputs; this duplicated the already-present JavaScript-based functionality.
  • Fixed “OR” glitch on login page if SAML authentication is enabled but not configured.
  • The send_test_email management command now shows the full SMTP conversation on failure.
  • Provided a change_password management command which takes a --realm option.
  • Fixed upgrade-zulip-from-git crashing in CSS source map generation on 1-CPU systems.
  • Added an auto_signup field in SAML configuration to auto-create accounts upon first login attempt by users who are authenticated by SAML.
  • Provided better error messages when puppet_classes in zulip.conf are mistakenly space-separated instead of comma-separated.
  • Updated translations for many languages.

Fixing potential database corruption caused by host OS upgrades

We have also identified a possible source of database corruption, which affects users who have upgraded their database hosts, at any point in the past, from Ubuntu 18.04 to Ubuntu 20.04, or from Debian Stretch to Debian Buster. Specifically, those OS upgrades also upgrade the low-level glibc library to a new major version, which changes how PostgreSQL orders text data (a.k.a. “collations”). Indexes built under the old ordering are no longer sorted correctly under the new one, which corrupts those database indexes. This, in turn, can result in duplicates being generated for objects where Zulip has configured the database to enforce uniqueness, because PostgreSQL uses database indexes to implement that feature.
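
To make the mechanism concrete, here is a minimal Python sketch. It is only an analogy, not how PostgreSQL or glibc are implemented: the two "collations" are hypothetical stand-ins, and the "index" is just a sorted list searched with binary search. It shows how a lookup that trusts a stale sort order can miss an existing value, so a uniqueness check passes and a duplicate gets stored.

```python
import bisect


def old_collation(s):
    """Hypothetical 'before' ordering: plain code-point order."""
    return s


def new_collation(s):
    """Hypothetical 'after' ordering: case-insensitive, ties broken by code point."""
    return (s.lower(), s)


def index_contains(index, value, collation):
    """Binary-search `index`, trusting that it is sorted under `collation`."""
    keys = [collation(entry) for entry in index]
    pos = bisect.bisect_left(keys, collation(value))
    return pos < len(index) and index[pos] == value


def insert_unique(index, value, collation):
    """Insert `value` unless the lookup finds it; return True if it was inserted."""
    if index_contains(index, value, collation):
        return False
    keys = [collation(entry) for entry in index]
    index.insert(bisect.bisect_left(keys, collation(value)), value)
    return True


# The index was built while the old ordering was in effect.
index = sorted(["alice", "bob", "Zoe"], key=old_collation)

# After the ordering changes, a lookup for an existing value can miss it,
# so the uniqueness check passes and a duplicate is stored.
found_before = index_contains(index, "alice", old_collation)
found_after = index_contains(index, "alice", new_collation)
inserted_duplicate = insert_unique(index, "alice", new_collation)

# Rebuilding the index under the new ordering makes lookups reliable again.
rebuilt = sorted(["alice", "bob", "Zoe"], key=new_collation)
found_after_rebuild = index_contains(rebuilt, "alice", new_collation)
```

In this toy model, rebuilding the sorted list plays the role that regenerating the affected indexes plays in the real database.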

The corruption caused by this issue is generally rare, repairable, and often invisible to users. But it can result in users getting 500 errors when trying to access Zulip, so we recommend that all installations take the steps described here to fix it. We’ve also updated the documentation for upgrading the Zulip host system OS to detail running a reindexing tool before starting the Zulip server back up, which will prevent any new servers from being affected by this problem.

Zulip servers that have previously upgraded the OS on their PostgreSQL host (typically, the Zulip server) will need to repair their database. This release adds a tool to do this repair by regenerating all affected indexes:

/home/zulip/deployments/current/scripts/setup/reindex-textual-data --force

Users on PostgreSQL 11 or higher can pass --concurrently to run this without taking table-wide write locks, at some performance cost. Without --concurrently, this will take write locks which will block normal use of the Zulip server; we suggest that you stop the service first, with scripts/stop-server.
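
The choice between the two modes can be sketched as a small Python helper that only builds the list of commands to run; it does not execute anything. The helper name `plan_reindex` is hypothetical, the full path to `stop-server` is inferred from the deployment path used above, and `scripts/start-server` is an assumption (it is not mentioned in this post), so check it against your installation before relying on it.

```python
from typing import List

DEPLOY = "/home/zulip/deployments/current"
REINDEX = DEPLOY + "/scripts/setup/reindex-textual-data"


def plan_reindex(postgres_major_version: int) -> List[List[str]]:
    """Return the commands to run, in order, for a given PostgreSQL major version."""
    if postgres_major_version >= 11:
        # No table-wide write locks, so the server can keep running
        # (at some performance cost).
        return [[REINDEX, "--force", "--concurrently"]]
    # Without --concurrently the tool takes write locks, so stop the
    # service first and start it again afterwards.
    return [
        [DEPLOY + "/scripts/stop-server"],
        [REINDEX, "--force"],
        [DEPLOY + "/scripts/start-server"],  # assumed counterpart to stop-server
    ]
```

For example, `plan_reindex(10)` returns three commands (stop, reindex, start), while `plan_reindex(12)` returns the single concurrent reindex command.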

It is safe to run this tool even if you’re not sure whether your system is affected.

The tool will regenerate all indexes that have not already caused duplicate rows. If any duplicate rows have already been created in violation of a given database index, the tool will report the affected index and sample duplicate rows. The duplicate rows must be manually repaired before running the tool again to regenerate those indexes. (The database cannot enforce a uniqueness constraint when there are already rows violating that constraint in the database.)
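
As a rough illustration of that last point, the following Python sketch uses an in-memory SQLite database as a stand-in for PostgreSQL. The `account` table and its `email` column are hypothetical and are not Zulip's schema. It finds values that occur more than once and shows that a unique index cannot be built while they exist.

```python
import sqlite3


def find_duplicates(conn, table, column):
    """Return (value, count) pairs for values that appear more than once.

    `table` and `column` are interpolated into the SQL, so only pass
    trusted identifiers.
    """
    query = (
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"GROUP BY {column} HAVING COUNT(*) > 1 ORDER BY {column}"
    )
    return conn.execute(query).fetchall()


def try_unique_index(conn, table, column):
    """Try to build a unique index; return False if duplicates prevent it."""
    try:
        conn.execute(
            f"CREATE UNIQUE INDEX {table}_{column}_uniq ON {table} ({column})"
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO account (id, email) VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")],
)

duplicates = find_duplicates(conn, "account", "email")
index_built = try_unique_index(conn, "account", "email")
```

Here `duplicates` lists the repeated email and `index_built` is false, which mirrors why the tool reports such indexes instead of regenerating them.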

One can manually repair the duplicates by carefully adjusting foreign keys to point to the original object and then deleting the duplicate objects; it is best to do this after stopping the server. Support for doing these manual repairs is available in this chat.zulip.org thread.
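
The shape of such a repair can be sketched in Python against an in-memory SQLite database. This is an illustration under assumptions, not a script to run against a Zulip database: the `account` and `message` tables are hypothetical, and treating the row with the lowest `id` as the original object is an assumption you would need to verify for real data.

```python
import sqlite3


def build_demo_db():
    """Create a toy database with one duplicated account (ids 1 and 3)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute("CREATE TABLE message (id INTEGER PRIMARY KEY, sender_id INTEGER)")
    conn.executemany(
        "INSERT INTO account (id, email) VALUES (?, ?)",
        [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com")],
    )
    conn.executemany(
        "INSERT INTO message (id, sender_id) VALUES (?, ?)",
        [(10, 1), (11, 3), (12, 2)],
    )
    conn.commit()
    return conn


def merge_duplicates(conn):
    """Repoint foreign keys to the kept row, then delete the duplicates.

    Returns the number of duplicate groups that were merged.
    """
    with conn:  # one transaction: commit on success, roll back on error
        groups = conn.execute(
            "SELECT email, MIN(id) FROM account "
            "GROUP BY email HAVING COUNT(*) > 1"
        ).fetchall()
        for email, keeper_id in groups:
            # Step 1: adjust foreign keys to point to the original object.
            conn.execute(
                "UPDATE message SET sender_id = ? WHERE sender_id IN "
                "(SELECT id FROM account WHERE email = ? AND id != ?)",
                (keeper_id, email, keeper_id),
            )
            # Step 2: delete the duplicate objects.
            conn.execute(
                "DELETE FROM account WHERE email = ? AND id != ?",
                (email, keeper_id),
            )
    return len(groups)


conn = build_demo_db()
merged = merge_duplicates(conn)
# With the duplicates gone, the unique index can be built again.
conn.execute("CREATE UNIQUE INDEX account_email_uniq ON account (email)")
```

A real repair has to cover every table that references the duplicated rows, which is why it is best done with the server stopped.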

We always recommend taking a backup before making any manual changes to the Zulip database.

Edit: This tool was accidentally omitted from the 4.4 release, so you need to upgrade to 4.5 to get it. (We recommend always upgrading to the latest minor release in a series.)

Upgrading

We recommend that all installations upgrade to this new release. See the upgrade instructions in the Zulip documentation. If you need help, best-effort support is available on chat.zulip.org.

Community

We love feedback from the Zulip user community. Here are a few ways you can connect: