Mailing List Archive

Fwd: [JuliaLang] Pkg downtime incident
And this is why one shouldn't depend on external services for critical stuff.
Beautiful example.

---------- Forwarded Message ----------

Subject: [JuliaLang] Pkg downtime incident
Date: Wednesday, 5 August 2020, 01:08:13 EEST
From: Keno Fischer via JuliaLang <julialang@discoursemail.com>
To: huettel@gmail.com




Earlier today, several users started seeing issues installing packages. This
post seeks to collect all the information related to this incident.

# Impact

The issue caused the wrong version of a package to be installed: the latest
master instead of the prior version that was requested.
- Versions of Julia prior to 1.4 will silently install the wrong version
- Windows versions of Julia 1.4.x will also silently install the wrong version
- Non-Windows versions of Julia 1.4.x will issue a warning and fall back to
  git to obtain the correct version
- Julia 1.5 is unaffected when using the Pkg server (which is the default);
  otherwise it matches the 1.4 behavior

The issue has since been mitigated in the registry, so you were only affected
if you were attempting package operations on an affected version between
approximately 2pm Eastern and 3:43pm Eastern when the mitigation went into
effect.

# Symptoms

Installing the wrong version of a Julia package can cause incorrect behavior
in several different ways. Perhaps the most common will be inscrutable package
dependency errors, but more subtle behaviors are possible. If you performed a
package operation today, you may want to see the mitigation section below as a
precaution.

# Mitigation

If an incorrect package version was installed, it will be locally cached until
removed. As such, if you believe you were affected, it is advisable to clear
your package cache by deleting `.julia/packages`. Note that your list of
installed packages will not be affected and you may re-download all installed
packages in your current environment by using `Pkg.instantiate()`.
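
The two steps above can be sketched in Julia as follows. This is a minimal
sketch, not an official tool: `clear_package_cache` is a hypothetical helper
name, and it assumes the package cache lives in the first entry of
`DEPOT_PATH` (normally `~/.julia`).

```julia
using Pkg

# Hypothetical helper: delete the `packages` directory of a depot.
# Assumption: the default depot is the first entry of DEPOT_PATH.
function clear_package_cache(depot::AbstractString = first(DEPOT_PATH))
    packages_dir = joinpath(depot, "packages")
    # `force = true` makes this a no-op if the directory does not exist.
    rm(packages_dir; recursive = true, force = true)
    return packages_dir
end

# Usage (run inside the affected environment):
#   clear_package_cache()
#   Pkg.instantiate()   # re-downloads the packages of the current environment
```

Only the cached package sources are removed, so the environment's list of
installed packages stays as it was.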

# Root cause

The root cause of this incident was an unannounced server-side change by
GitHub, which broke the download of tarballs by git tree hash. For example,
https://api.github.com/repos/JuliaLang/MbedTLS.jl/tarball/2d94286a9c2f52c63a16146bb86fd6cdfbf677c6
previously returned the tarball for that tree hash, while it now returns the
tarball for master instead. We do not yet know whether this change was
intentional or not. The reason this change broke Pkg is that Pkg includes a
heuristic where it uses the tarball download feature instead of a full git
checkout as a faster way to download a requested version (since it no longer
needs to download the full repository with all its history). This heuristic is
special-cased for github.com and does not affect packages hosted elsewhere
(though the vast majority of packages are currently hosted on GitHub).

# Registry workaround

The registry mitigation mentioned above was
https://github.com/JuliaRegistries/General/pull/18991/files, which changes the
URL for all registered packages from `github.com` to `GitHub.com`. This
defeats the above-mentioned heuristic and forces older versions of Julia to
fall back to a full git checkout instead. This method is slower, but should
yield the correct package version. Note that Julia 1.5+ is unaffected and
downloads via the Pkg server will continue to be fast.
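
To illustrate why a change of letter case is enough, here is a rough sketch
of such a heuristic. It is not Pkg's actual code: `github_tarball_url` and
its regular expression are assumptions made for illustration. The idea is
that the match on `github.com` is case-sensitive, while hostnames are not, so
git still reaches the same repository.

```julia
# Hypothetical helper, not Pkg's actual code: returns the GitHub API tarball
# URL for a repository URL, or `nothing` when the fast path does not apply.
function github_tarball_url(repo_url::AbstractString, tree_hash::AbstractString)
    # Julia regexes are case-sensitive by default, so only the lowercase
    # spelling `github.com` matches here.
    m = match(r"^https://github\.com/([^/]+)/([^/]+?)(?:\.git)?$", repo_url)
    m === nothing && return nothing
    owner, repo = m.captures
    return "https://api.github.com/repos/$owner/$repo/tarball/$tree_hash"
end

tree = "2d94286a9c2f52c63a16146bb86fd6cdfbf677c6"

# Lowercase host: the fast path applies and a tarball URL is returned.
github_tarball_url("https://github.com/JuliaLang/MbedTLS.jl.git", tree)

# Host rewritten by the workaround: no match, so the caller would use git.
github_tarball_url("https://GitHub.com/JuliaLang/MbedTLS.jl.git", tree)
```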

# Additional considerations/General registry updates paused

We have contacted GitHub to find out whether this change was intentional and
is likely to persist. If so, we will need to update Registrator and the
validation CI to force packages registered at GitHub to use the same
`GitHub.com` workaround we manually applied to the registry. If not, the
workaround will be reverted as soon as GitHub restores the original behavior
(to get back to faster package download speeds on older versions). In the
meantime, changes (new packages/version bumps) to the General registry are
paused. They will be resumed once either of the two options has been
completed.

# Future considerations

As noted, Julia versions 1.5+ are not affected due to the Pkg server work
(which was partly motivated by a desire to avoid incidents like this one).
However, such Julia versions will still fall back to raw GitHub downloads if
the package server is unavailable for some reason (broken, blocked by
corporate firewall, we forgot to pay our bills, etc.). In the near future, the
validation currently present on non-Windows versions will be extended to
Windows versions, such that even with a broken package server, the fallback
path would itself fall back to Git if it is being served incorrect tarballs
(the same verification will of course extend to the package server also). This
change has been planned for some time and the requisite support is already
available in Tar.jl, but has not yet been wired up in Pkg.
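
A check of this kind might look roughly like the sketch below. It is not the
planned Pkg implementation: `tarball_is_valid` and `install_verified` are
hypothetical names. It assumes that `Tar.tree_hash` from Tar.jl is the
relevant support, that `Tar` can be loaded in the running Julia, and that the
tarball is already decompressed with the package tree at its root.

```julia
using Tar

# Hypothetical helper: true when the git tree hash of an uncompressed tarball
# equals the hash recorded for the requested package version.
function tarball_is_valid(tarball::AbstractString, expected_hash::AbstractString)
    # `skip_empty = true` ignores empty directories, which git cannot store.
    actual_hash = Tar.tree_hash(tarball; skip_empty = true)
    return actual_hash == lowercase(expected_hash)
end

# Hypothetical install step: use the tarball only if it verifies; otherwise
# call the supplied fallback (for example a full git checkout).
function install_verified(use_tarball, git_fallback, tarball, expected_hash)
    if tarball_is_valid(tarball, expected_hash)
        return use_tarball(tarball)
    else
        @warn "Tarball does not match the requested tree hash; falling back to git"
        return git_fallback()
    end
end
```

With a check like this in place, a server that returns the tarball for master
instead of the requested tree hash would trigger the git fallback rather than
a silent install of the wrong version.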





---
[Visit Topic](https://discourse.julialang.org/t/pkg-downtime-incident/44288/1)
or reply to this email to respond.


-------------------------------------------------------------
--
Andreas K. Hüttel
dilfridge@gentoo.org
Gentoo Linux developer
(council, qa, toolchain, base-system, perl, libreoffice)