mem_cgroup_reparent_charges() could get stuck while holding cgroup_mutex and make the whole system hang.
kvm: inefficient memory shrinking for VMs.
It was discovered that a node with dozens of CPU cores, lots of RAM and many VMs running could get into a situation when almost all CPU cores were busy in mmu_shrink_scan(). This could happen because memory shrinking was done under kvm_lock spinlock and only for one VM at a time. All CPU cores but one just waited for kvm_lock in such cases, while the last one was busy with the actual memory shrinking for a VM.
fuse_kio_pcs: latency was calculated incorrectly.
It was found that the in-kernel implementation of Virtuozzo Storage client stored latency values in milliseconds rather than in microseconds, resulting in bogus statistics data.
Processes could hang while closing a file located on the storage cluster.
OOM killer would kill tasks from cgroups without memory guarantees first.
If the amount of free memory is low, OOM killer would kill the tasks from cgroups without memory guarantees first. However, it seems more reasonable to kill the tasks from cgroups exceeding their guarantees the most.
virtio_scsi: a race condition in the Linux block layer could cause certain I/O requests to hang.
ploop: kernel crash in ploop_congested().
ext4: inode tables created during online resize were not zeroed.
It was discovered that inode tables created during online resize of an ext4 filesystem were not zeroed after that. This could potentially result in lower performance of the filesystem.
Windows Server 2016 Essentials failed to install into a QEMU VM with disabled PMU.
It was found that if no PMU counters were exposed to guest, KVM skipped the whole remaining PMU-related initialization, including filling of LBR-related data. As it turned out, Windows Server 2016 Essentials tried to access these data during the installation and failed to install as a result.
ploop: 'pcompact' could hang if run simultaneously with 'ploop-balloon status'
Memory leak in the implementation of IPv4 routing.
It was discovered that a certain sequence of operations related to IPv4 routing could trigger a kernel memory leak. An attacker could potentially exploit that from a container to cause a denial of service.
KVM: potential use-after-free via kvm_ioctl_create_device().
A use-after-free vulnerability was found in the way KVM implements its device control API. When a device is created via kvm_ioctl_create_device(), it holds a reference to a VM object. This reference is transferred to file descriptor table of the caller. If such file descriptor was closed, reference count to the VM object could become zero, which could lead to a use-after-free issue. A user/process could use this flaw to crash the guest VM resulting in a denial of service or, potentially, gain privileged access to a system.
KVM: use-after-free in the emulation of the preemption timer for the L2 guest systems.
A use-after-free vulnerability was found in the way KVM emulates a preemption timer for L2 guests when nested virtualization is enabled. A guest user/process could use this flaw to crash the host kernel resulting in a denial of service or, potentially, gain privileged access to a system.
I/O errors were reported after a successful replacement of the ploop images.
'ploop replace' did not clear 'abort' flag.
It was found that if a ploop image was revoked and then replaced using 'ploop replace', 'abort' flag was not cleared. As a result, subsequent I/O operations would fail.
ploop: potential data corruption due to a race between 'prepare_merge' and 'submit_alloc' operations.
vzstat shows incorrect per-CT scheduling latency (MLAT).
High order page allocations were made in neigh_probe() in certain cases.
High order page allocations were triggered by CRIU while restoring TCP sockets.
Network performance issues due to the usage of pfmemalloc reserves.
It was discovered that network drivers could allocate memory for the socket buffers from pfmemalloc memory reserves, even when it was unnecessary. As a result, the network packets were dropped by sk_filter_trim_cap() causing performance issues.
fuse_kio_pcs: kernel crash in process_pcs_init_reply() caused by a double free.
fuse_kio_pcs: kernel crash in kpcs_kill_requests().
skb drops due to the usage of pfmemalloc reserves were difficult to debug.
Additional diagnostics was introduced to make it easier to detect and analyze skb drops due to the usage of pfmemalloc reserves.
KVM did not update CPUID bits OSXSAVE and OSPKE in some cases.
It was discovered that CPUID bits OSXSAVE and OSPKE were not updated properly by KVM when the guest system rebooted. As a result, the guest system could crash.
The per-container limit on the network interfaces was too low for Docker in some cases.
It was discovered that Docker running inside a Virtuozzo container could hit the limit on the network interfaces (256) when it tried to start 50+ its containers. This fix allows changing that limit for the running containers and increases the default limit to 1024.
txqueuelen could not be changed via SIOCSIFTXQLEN ioctl on the host.
Kernel crash in ext4_clear_inode().
A large tarball with a lot of small files can fail to unpack inside a container if kmem limit is set.
It was found that unpacking a large tarball with a lot of small files could fail inside a container. This could happen because kmem limit was hit prematurely, while reclaimable memory was still available.
sr_mod: kernel crash in sr_block_revalidate_disk().