Fix for Intel i915 GPU Freeze on Recent Linux Kernels

Recently Intel started including their graphics drivers into the mainline Linux kernel.  This is great except when it stops working.  Having suffered intermittent GPU freezes on my Lenovo x270 (Kabylake) work laptop since kernel 4.12, I came across a bug report that seemed related.  Here’s my temporary fix on Fedora 28 for getting things stable again until it’s fixed for good upstream.

 

The Intel Integrated Graphics Crash
I started having complete system freezes intermittently where my laptop display would shudder / jitter and then hard lock.  There was no real pattern to this happening nor was I able to get any log file information or journal information about the issue as it completely froze.  No network, no ping, nothing.

VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02)

I’ve had this issue on both Skylake and Kabylake Intel-based laptop systems with integrated HD graphics (i915 driver) across Kernels 4.12+ through the 4.16.8 Fedora 28 Kernel.  I recall this also happened on my previous Lenovo x240 though less frequently.

I am sure at some point this will be permanently fixed and trickle down to all the major Linux distributions.  I was impatient and frustrated and wanted something to work right away, so here’s how I got there.

Unrelated – Micro Freezes on Linux 5.x Kernels and i915

Update 2019-09-09:  The original issue I walk through in this blog post is not related to the recent micro-freezes on Linux 5.x kernels with i915 integrated GPUs.  There is a proposed patch pending for this here and I’ve been testing the following kernel option as a workaround for now:

i915.enable_psr=0

Panel Self Refresh (PSR), a power saving feature used by Intel iGPUs is known to cause flickering in some instances FS#49628 FS#49371 FS#50605. A temporary solution is to disable this feature using the kernel parameter i915.enable_psr=0.
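To confirm the parameter actually reached the running kernel, one option is to look for it in /proc/cmdline.  This is only a minimal sketch: the helper name has_kernel_param is made up for illustration, and the grubby hint assumes a Fedora system where grubby manages the boot entries.

```shell
#!/usr/bin/env bash
# has_kernel_param is a hypothetical helper (not part of any tool).
# It succeeds when PARAM appears as a whole token in the command line file.
has_kernel_param() {
    local param=$1
    local cmdline_file=${2:-/proc/cmdline}
    [ -r "$cmdline_file" ] || return 1
    # One token per line, then an exact fixed-string match on a whole line.
    tr -s '[:space:]' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

if has_kernel_param "i915.enable_psr=0"; then
    echo "i915.enable_psr=0 is set on the running kernel"
else
    echo "i915.enable_psr=0 is not set; on Fedora it can be added with:"
    echo "  sudo grubby --update-kernel=ALL --args=i915.enable_psr=0"
fi
```

The exact-token match avoids false positives such as i915.enable_psr=01.  A reboot is needed before a newly added parameter shows up in /proc/cmdline.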

Update: 2018-05-28:  It seems that as of Skylake Fedora no longer uses the intel/i915 driver by default if you’re using Xorg.  It instead uses the xorg-x11-drv-intel driver which means the testing below is not relevant currently (thanks to Venemo in the comments).

I am doing more testing now against the stock 4.16.10 kernel using the Intel driver in lieu of the xorg-x11-drv-intel driver.

To switch to the intel driver, create an Xorg shim config as root and then restart Xorg or reboot.

cat > /etc/X11/xorg.conf.d/10-intel.conf <<EOF
Section "Device"
Identifier "Intel Graphics"
Driver "intel"
EndSection
EOF

If you were running the xorg Intel drivers you’d have seen something like this in /var/log/Xorg.0.log:

[ 18.547] X.Org Video Driver: 23.0
[ 18.547] X.Org XInput driver : 24.1
[ 18.547] X.Org Server Extension : 10.0

If you are now running the Intel i915 driver you’d see this instead:

[ 17.369] (II) intel: Driver for Intel(R) Integrated Graphics Chipsets:
i810, i810-dc100, i810e, i815, i830M, 845G, 854, 852GM/855GM, 865G,
915G, E7221 (i915), 915GM, 945G, 945GM, 945GME, Pineview GM,
Pineview G, 965G, G35, 965Q, 946GZ, 965GM, 965GME/GLE, G33, Q35, Q33,
GM45, 4 Series, G45/G43, Q45/Q43, G41, B43
[ 17.369] (II) intel: Driver for Intel(R) HD Graphics
[ 17.369] (II) intel: Driver for Intel(R) Iris(TM) Graphics
[ 17.369] (II) intel: Driver for Intel(R) Iris(TM) Pro Graphics
[ 17.370] (II) intel(0): Using Kernel Mode Setting driver: i915, version 1.6.0 20171222
[ 17.372] (--) intel(0): Integrated Graphics Chipset: Intel(R) HD Graphics 620
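Rather than reading the log by eye, the check can be scripted.  The sketch below greps the Xorg log for the "(II) intel(0):" prefix shown above.  The helper name detect_x_driver is hypothetical, and the "(II) modeset(0):" prefix for the generic modesetting driver is an assumption on my part, not something taken from the logs in this post.

```shell
#!/usr/bin/env bash
# detect_x_driver is a hypothetical helper name.
# It prints "intel", "modesetting" or "unknown" for the given Xorg log.
detect_x_driver() {
    local log=${1:-/var/log/Xorg.0.log}
    if [ ! -r "$log" ]; then
        echo "unknown"
    elif grep -qF '(II) intel(0):' "$log"; then
        echo "intel"
    elif grep -qF '(II) modeset(0):' "$log"; then
        # Assumption: the generic modesetting driver logs with this prefix.
        echo "modesetting"
    else
        echo "unknown"
    fi
}

detect_x_driver
```

Pass a different path as the first argument if your Xorg log lives somewhere other than /var/log/Xorg.0.log.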

Update: 2018-05-29: I’m still getting GPU freezes with the 4.16.11 Fedora 28 kernel and the Intel i915 driver.

I have also tried the following kernel parameter which doesn’t help:

i915.enable_dc=0
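When testing parameters like this it helps to confirm what the loaded module actually sees.  The sketch below prints the values exposed under /sys/module/i915/parameters.  The helper name show_i915_params is hypothetical, and since many of these files are readable by root only, an unprivileged run may print nothing.

```shell
#!/usr/bin/env bash
# show_i915_params is a hypothetical helper name.
# Prints name=value for each readable parameter file in the directory.
show_i915_params() {
    local dir=${1:-/sys/module/i915/parameters}
    local f
    for f in "$dir"/*; do
        # Skip unreadable entries (many are root-only) and an unmatched glob.
        [ -f "$f" ] && [ -r "$f" ] || continue
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Narrow the listing to the parameters discussed in this post.
show_i915_params | grep -E '^(enable_dc|enable_psr)=' \
    || echo "no readable i915 parameters found (try as root)"
```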

I am now testing the latest drm-tip against the Fedora Rawhide 4.17.0-rc kernels as I have time to hopefully see if/when a fix appears.

Update: 2018-06-04: So far things have been stable for 4 days on 4.17.0-rc7 and latest Intel drm-tip kernel tree modules copied in, I will keep this updated if I have another freeze.

Commenter Venemo has stated he’s still getting this freeze on 4.17.0-rc7 however on another system.

Update: 2018-06-05:  I experienced another GPU freeze but it took about ~5 days of normal usage and dozens of suspend/resumes.  For giggles I tested trying to trigger this in the BIOS and was able to get screen artifacts by jerking the laptop around and also by squeezing (a normal amount) the palm rest area of the laptop.

I also tried a Windows 10 USB stateless image and indeed I get the same GPU freezes there too.  This makes me believe it’s a hardware defect.  I’ve filed a ticket with Lenovo and I’ll be mailing my laptop in for repair/replacement – I’ll let ya know how it goes.

Update: 2018-07-17: I received my laptop back from repair and the motherboard and SATA cables were replaced.  I’ve had zero issues for almost a month now on the 4.17.7+ kernel on Fedora 28.  The work receipt from Lenovo lists the replaced assemblies: PCB, motherboard, cables, wire.

I am leaving the rest of this blog post / guide up in case it might be useful for someone tracing down Intel GPU issues on Linux or filing bugs against upstream.

Testing Intel Upstream Linux Kernel Drivers

Below is how I previously tested the latest 4.17.0-rc kernel and Intel drm-tip kernel modules which may still be useful to others so I’m leaving it here.

Temporary Fix for the Intel Graphics Crash
The fix I found was to use the absolute latest Intel drm-tip git kernel code combined with a 4.17.0-rc5 kernel build.  I installed a Fedora development (Rawhide) kernel and then copied in the compiled Intel kernel modules afterwards.  YOLO.

The full docs for setting up the latest Intel stack is here but I’m going to explain just the basics in case you are hitting this as well and want to get up and running quickly.

Build Kernel (Modules) Against drm-tip
This is going to clone a rather large git repository of all the upstream Intel drm bits and build the latest kernel and modules.  Note that we’re omitting the actual make install of the kernel; we only care about the modules.  You’re going to need them later.

First you’ll need some build and compiler tools.  This is what I needed to install beforehand:

sudo dnf install openssl-devel automake gcc elfutils-libelf-devel zlib-devel flex bison

Next build the thing against the latest upstream Intel drm-tip repository.  This may take quite some time.  Ironically the GPU froze on me a few times trying to build the latest drivers that should supposedly contain the crash fix!  Maybe it could sense it.

export MY_DISTRO_PREFIX=/usr
export MY_DISTRO_LIBDIR=/usr/lib64
git clone git://anongit.freedesktop.org/drm-tip
cd drm-tip
make defconfig
sed -i 's/CONFIG_DRM_I915=y/CONFIG_DRM_I915=m/g' .config
sed -i 's/CONFIG_DRM=y/CONFIG_DRM=m/g' .config
sed -i 's/CONFIG_DRM_MIPI_DSI=y/CONFIG_DRM_MIPI_DSI=m/g' .config
sed -i 's/CONFIG_DRM_KMS_HELPER=y/CONFIG_DRM_KMS_HELPER=m/g' .config
sed -i 's/CONFIG_DRM_KMS_FB_HELPER=y/CONFIG_DRM_KMS_FB_HELPER=m/g' .config
sed -i 's/CONFIG_DRM_FBDEV_EMULATION=y/CONFIG_DRM_FBDEV_EMULATION=m/g' .config
sed -i 's/CONFIG_DRM_I915_CAPTURE_ERROR=y/CONFIG_DRM_I915_CAPTURE_ERROR=m/g' .config
sed -i 's/CONFIG_DRM_I915_COMPRESS_ERROR=y/CONFIG_DRM_I915_COMPRESS_ERROR=m/g' .config
sed -i 's/CONFIG_DRM_I915_USERPTR=y/CONFIG_DRM_I915_USERPTR=m/g' .config
sed -i 's/CONFIG_DRM_PANEL=y/CONFIG_DRM_PANEL=m/g' .config
sed -i 's/CONFIG_DRM_BRIDGE=y/CONFIG_DRM_BRIDGE=m/g' .config
sed -i 's/CONFIG_DRM_PANEL_BRIDGE=y/CONFIG_DRM_PANEL_BRIDGE=m/g' .config
sed -i 's/CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y/CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=m/g' .config
make
sudo make modules_install

The last line will copy the kernel modules you create into /lib/modules/KERNEL_VERSION/kernel/drivers/gpu/drm along with a bunch of other kernel drivers we’re not going to need.
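The long run of sed commands above can also be written as a loop, which makes it harder to mistype a single line.  This is a sketch rather than what I actually ran: the helper name set_config_modular is made up, and it assumes GNU sed for the -i flag.

```shell
#!/usr/bin/env bash
# set_config_modular is a hypothetical helper name.
# Usage: set_config_modular CONFIG_FILE SYMBOL...
# Rewrites CONFIG_<SYMBOL>=y to CONFIG_<SYMBOL>=m, leaving other lines alone.
set_config_modular() {
    local config=$1
    shift
    local sym
    for sym in "$@"; do
        # Anchored so a symbol only matches its own line (GNU sed -i).
        sed -i "s/^CONFIG_${sym}=y\$/CONFIG_${sym}=m/" "$config"
    done
}

# Only touch a .config if one is present in the current directory.
if [ -f .config ]; then
    set_config_modular .config DRM DRM_I915 DRM_MIPI_DSI DRM_KMS_HELPER \
        DRM_PANEL DRM_BRIDGE
fi
```

Running make olddefconfig afterwards should let Kconfig reconcile any symbol that cannot actually be built as a module.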

If you get a compilation issue about asm-goto support you’ll need to comment that out of arch/x86/Makefile and try again:

#ifndef CC_HAVE_ASM_GOTO
# $(error Compiler lacks asm-goto support.)
#endif

Install Rawhide Development Kernel
While you could just run the drm-tip kernel, chances are you’d need a whole lot more modules configured/enabled for your hardware.  I find it is much easier to just use your distribution’s latest kernel (if it matches latest upstream) as those are generally better configured for most hardware use cases and you’ll have everything reasonable provided as a loadable module.

You might substitute Rawhide here for your distribution’s development / bleeding edge kernel, like Tumbleweed for openSUSE.  For Fedora users I am providing the direct paths here, which may change, so double check the parent location.

cd /tmp/
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-core-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-devel-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-headers-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-modules-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-modules-extra-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
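Since the six URLs differ only in the package name, they can be generated from a single version string, which makes it easier to move to a newer Rawhide build.  A sketch, with the helper name rawhide_kernel_urls made up for illustration and the mirror path copied from the commands above (it may have changed since):

```shell
#!/usr/bin/env bash
# rawhide_kernel_urls is a hypothetical helper name.
# Usage: rawhide_kernel_urls VERSION  (e.g. 4.17.0-0.rc5.git1.1.fc29.x86_64)
rawhide_kernel_urls() {
    local version=$1
    local base="http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k"
    local pkg
    for pkg in kernel kernel-core kernel-devel kernel-headers \
               kernel-modules kernel-modules-extra; do
        printf '%s/%s-%s.rpm\n' "$base" "$pkg" "$version"
    done
}

# Print the list; pipe it into "wget -i -" to actually download.
rawhide_kernel_urls "4.17.0-0.rc5.git1.1.fc29.x86_64"
```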

Install the Development 4.17+ Kernel
Now install the Rawhide development kernel and associated packages.

cd /tmp/
sudo dnf localinstall kernel-*.rpm

Copy the drm-tip Kernel GPU Modules Over
The current Rawhide kernel does not have the latest version of the Intel drivers that contain the actual fix, so we’re going to copy them in manually.  This is fairly bad practice in general but we don’t really care – we’d prefer something working to good etiquette that doesn’t.

Your paths and names may vary, but I copied over the entirety of the /lib/modules/KERNEL_MODULES_YOU_BUILT/kernel/drivers/gpu/drm/* into the modules location of the Rawhide kernel that I just installed.

sudo cp -Rv /lib/modules/4.17.0-rc5+/kernel/drivers/gpu/drm/* /lib/modules/4.17.0-0.rc5.git1.1.fc29.x86_64/kernel/drivers/gpu/drm/
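Because a typo in either version string would silently copy into the wrong place, a slightly more defensive version of the copy checks that both module trees exist first.  This is a sketch under assumptions: the helper name copy_drm_modules is made up, and the optional third argument exists only so the function can be tried against a scratch directory.

```shell
#!/usr/bin/env bash
# copy_drm_modules is a hypothetical helper name.
# Usage: copy_drm_modules SRC_VERSION DST_VERSION [MODULES_ROOT]
copy_drm_modules() {
    local src_ver=$1
    local dst_ver=$2
    local root=${3:-/lib/modules}
    local src="$root/$src_ver/kernel/drivers/gpu/drm"
    local dst="$root/$dst_ver/kernel/drivers/gpu/drm"
    # Refuse to copy unless both module trees are really there.
    [ -d "$src" ] || { echo "missing source tree: $src" >&2; return 1; }
    [ -d "$dst" ] || { echo "missing destination tree: $dst" >&2; return 1; }
    # Copy the contents of src into dst, keeping the directory layout.
    cp -R "$src"/. "$dst"/
}

# Example (run as root):
#   copy_drm_modules 4.17.0-rc5+ 4.17.0-0.rc5.git1.1.fc29.x86_64
```

Running sudo depmod -a 4.17.0-0.rc5.git1.1.fc29.x86_64 afterwards to refresh the module dependency data is my own suggestion here, not a step I did at the time.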

Again, not the most elegant fix but gets the job done.  At this point just reboot into the new kernel and if you had crashes before due to the bug I was hitting hopefully they have gone away.

Usage and Testing
After ~20 hours of GPU torture tests (hundreds of glxgears spinning, open/close images in a shell loop, suspend and resume constantly over and over, old games raging in wine) things seem pretty stable.  Before I’d get hard GPU lockups anywhere from 6 minutes to 6 hours into normal desktop usage.

I realize this is a rather temporary blog post and I’m positive that all this will get fixed in upstream kernels.  For now it was important (and frustrating enough) to find a fix as soon as possible and then write about it.  I hope this helps someone else.

How to Debug an Intel GPU Crash
If you’re lucky enough to get logs or data written in case of a GPU crash there’s an easy way to gather debug information to file an Intel graphics driver bug.

sudo mount -tdebugfs debug /sys/kernel/debug
sudo cat /sys/kernel/debug/dri/0/i915_error_state > i915_error_state
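To avoid attaching an empty dump to a bug report, the capture can be wrapped in a small check.  This is a sketch: the helper name save_error_state is hypothetical, and the "no error state collected" text it looks for is my assumption about what the driver prints when nothing has been recorded.

```shell
#!/usr/bin/env bash
# save_error_state is a hypothetical helper name.
# Usage: save_error_state [SOURCE] [OUTPUT]
save_error_state() {
    local src=${1:-/sys/kernel/debug/dri/0/i915_error_state}
    local out=${2:-i915_error_state}
    if [ ! -r "$src" ]; then
        echo "cannot read $src (is debugfs mounted, are you root?)" >&2
        return 1
    fi
    # Assumption: an idle driver reports a "no error state collected" line.
    if grep -qi 'no error state collected' "$src"; then
        echo "no GPU error state recorded" >&2
        return 1
    fi
    cat "$src" > "$out"
}

save_error_state || echo "nothing saved"
```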

 

About Will Foster

hobo devop/sysadmin/SRE

56 Responses to Fix for Intel i915 GPU Freeze on Recent Linux Kernels

  1. venemo says:

    Hi!

    I’m experiencing a similar freeze on a Dell XPS 13 9370. Basically the screen turns blank and the laptop stops reacting to any input – does this sound familiar?

    Some questions:
    – Do you use Wayland or Xorg when this occurs?
    – In case of Xorg, are you sure you actually used the Intel X11 driver? On Fedora it is not used on Skylake and newer hardware.

    • Will Foster says:

      Hey @venemo, yes this does sound similar to my issue minus the black screen – most of the time it just shudders a bit then freezes and laptop function keys respond but nothing else. Occasionally when resuming from suspend it will hang on a black screen (assuming my xscreensaver blank screen)

      I’m using xorg. Interesting, you are correct it seemed to have been using the Xorg intel driver.

      This makes me wonder if it’s instead a bug with the Xorg-x11-intel driver and has nothing to do with the i915/intel.

      • Will Foster says:

        I’m going to try forcing the Intel driver usage now via /etc/X11/xorg.conf.d/10-intel.conf


        Section "Device"
        Identifier "Intel Graphics"
        Driver "intel"
        EndSection

        I’m also testing out the following kernel parameter for Kabylake, I have no real scientific reason for this other than it was a workaround for a Baytrail bug that had similar symptoms.

        i915.enable_dc=0

        I’ll report back.

      • venemo says:

        For me there are two different freezes:
        1. most often it just freezes the current image on the screen, but with red stripes and dots at some places
        2. sometimes the screen goes completely blank

        Now interestingly some of these freezes are not actually the fault of the GPU, but rather my crappy Atheros wifi card. The ath10k wifi driver crashes and somehow takes the whole system with it. This has been to some degree mitigated by a firmware upgrade, but occasionally the firmware and/or the driver still crash. (But they don’t always bring down the entire system anymore.)

        In other cases however the crash occurred under graphics load, so it is very unlikely (but not completely impossible) to be the fault of the wifi driver. In this case, yes, I suspect the Intel GPU to be at fault. Now in my previous experience on my previous laptop, the power saving parameters that everyone suggests you mindlessly apply to your kernel command line can and do cause instability. So, I’m currently experimenting with tweaking some kernel boot parameters that disable some power saving options.

        With regards to the Intel X11 driver: the reason why Fedora (and Debian, etc.) decided to disable it on Skylake and above is because it has had a bad reputation of being quite unstable. See here: https://hansdegoede.livejournal.com/16976.html – I will leave it at that for now, but of course am interested in your findings!

      • Will Foster says:

        Ok so far:

        * Kernel 4.10.11 with Intel driver = freeze
        * Kernel 4.10.11 with Intel driver and i915.enable_dc=0 = freeze

        Back to the drawing board, I will keep testing newer drm-tip modules and 4.17.0-rc kernels from rawhide I suppose.

        In the meantime if you could also comment here with your findings this might help:

        https://bugs.freedesktop.org/show_bug.cgi?id=102586

      • venemo says:

        So right now I’ve got these kernel parameters: i915.enable_fbc=0 i915.reset=1 i915.modeset=1 i915.disable_power_well=0 i915.enable_dc=0

        – enable_fbc and disable_power_well were enabled by default, so I disabled them
        – modeset was -1 by default I think
        – added enable_dc=0 at your suggestion

        Unfortunately these also make the kernel tainted… And now in addition to the freezes I started seeing GPU driver crashes in dmesg, with no real noticeable benefits or fewer crashes.

      • venemo says:

        I’m starting to think that this is not (necessarily) a graphics problem (at least on my machine). Right now I’m testing a patch to the ath10k wifi driver which should eliminate at least part of the problem.

        Since messing around with those kernel parameters actually made the system less stable, I’m not using those anymore.

      • Will Foster says:

        Perhaps it may not be necessarily a GPU / Intel issue but I have an Intel wifi card for what it’s worth.

      • For me switching to i915 fixed the issue (linux 5.6.11). Since two days no blackouts. No extra kernel parameters. But I think package `xorg-x11-drv-intel` is the driver that we actually need because:

        > $ rpm -ql xorg-x11-drv-intel
        > /usr/bin/intel-virtual-output
        > /usr/lib/.build-id
        > /usr/lib/.build-id/27
        > /usr/lib/.build-id/27/0d688a5df5157e21c7e2969aa00fd8bdee6c40
        > /usr/lib/.build-id/31
        > /usr/lib/.build-id/31/b1ec46e7ef1a2d3f0fff9f6c5ae4f37d78bdd1
        > /usr/lib/.build-id/99
        > /usr/lib/.build-id/99/41e2ea929352993cf5269a16f147dda6c50fb0
        > /usr/lib/.build-id/d8
        > /usr/lib/.build-id/d8/273a81f8eee9665c2b52b8b6cdb644784f54d4
        > /usr/lib64/libIntelXvMC.so.1
        > /usr/lib64/libIntelXvMC.so.1.0.0
        > /usr/lib64/xorg/modules/drivers/intel_drv.so
        > /usr/libexec/xf86-video-intel-backlight-helper
        > /usr/share/doc/xorg-x11-drv-intel
        > /usr/share/doc/xorg-x11-drv-intel/COPYING
        > /usr/share/man/man4/intel-virtual-output.4.gz
        > /usr/share/man/man4/intel.4.gz
        > /usr/share/polkit-1/actions/org.x.xf86-video-intel.backlight-helper.policy

        And then in `man intel` I see:
        > SYNOPSIS
        > Section "Device"
        > Identifier "devname"
        > Driver "intel"
        > …
        > EndSection

        So I am wondering where is the “intel” driver that does not work and where is `i915` driver that does.

        Other than that switching between the two works as described in the blog. I’m just asking about packaging clarification.

  2. Will Foster says:

    I’m going to take a run at the latest drm-tip upstream code again and kernel-4.17.0-0.rc7.git0.1.fc29.x86_64.rpm from Rawhide and see if I have any luck, stay tuned.

    • venemo says:

      Just out of curiosity, did you try older kernels perhaps? Was there a version where this worked correctly?

    • Will Foster says:

      Ok I’ve got 4.17.0-rc7.git0.1.fc29.x86_64.rpm running now with drm-tip drm/i915 kernel modules as of 2018-05-30 running. So far so good, but then again I always say that.

      • venemo says:

        How is it going since then? Still OK? If yes, I might give drm-tip a try myself. How did you patch the Fedora kernel up to drm-tip?

        On my end, I’ve hooked up a serial port to see the messages from the kernel on another machine, hoping that something would turn up that points to the reasons for the random freezes. While this does yield useful info on the ath10k crashes it doesn’t show anything when the random freeze happens.

        At this point I suspect that I’ve simply got a faulty motherboard. But drm-tip could be still worth a try if it works for you. :)

      • Will Foster says:

        Hey Venemo,

        3 days and counting so far of stability, mostly my laptop has been suspended (I’m using hybrid-suspend). In the next few days I’ll use it more often and report back.

      • venemo says:

        What did you do to get drm-tip on top of the Fedora kernel? When I tried, git produced a patch file that wouldn’t apply.

      • Will Foster says:

        See the updated post, I had to comment out some lines in the Makefile because I got asm-goto errors:

        e.g. in

        arch/x86/Makefile

        #ifndef CC_HAVE_ASM_GOTO
        # $(error Compiler lacks asm-goto support.)
        #endif

        I also ensured anything related to i915 was compiled as ‘m’ or module for the subsequent cp operation into the existing 4.17.0 Rawhide kernel modules tree (I think I covered it all with sed at least in the post).

        So far things are stable, I’ve suspended and resumed about 10-15 times over 4 days and played some intensive wine / d3d games. I’m crossing my fingers but still keeping an eye on things.

        up 3 days, 23:23, 1 user, load average: 0.82, 0.84, 0.86

      • venemo says:

        Okay, I managed to do it: produced a patch using ‘git diff’ between 4.17-rc7 and latest drm-tip and then added the resulting patch file to the .spec file. The freeze is still there. I believe at this point I’ve exhausted all my options and should just accept that this is just a hardware defect.

      • Will Foster says:

        4.17 kernel is now stable/released, the only other changes I saw mentioned here besides the slew of i915 fixes in the various rc changelogs:

        drm/i915/lvds: Move acpi lid notification registration to
        registration phase
        drm/i915/query: Protect tainted function pointer lookup
        drm/i915/query: nospec expects no more than an unsigned long
        drm/i915: Disable LVDS on Radiant P845

        It’s discouraging that you still have the issue, I don’t want to return my work laptop for another (Lenovo?) model but if it is indeed a hardware defect perhaps they make one with an Nvidia card instead which honestly I’ve had no problems with on Linux. I really dig the x270 otherwise but this pretty much makes it unusable. I wonder if the same problem can be reproduced on Windows which would make it seem like it is for sure a hardware defect.

        What model / generation laptop do you have again? Just trying to correlate what other people are experiencing at work.

      • venemo says:

        This is a Dell XPS 13 9370. And I’m 99% sure this is just a random hardware defect.

      • Will Foster says:

        > This is a Dell XPS 13 9370. And I’m 99% sure this is just a random hardware defect.

        This might sound crazy, but have you ever noticed if you pick up the laptop or move it (particularly pinch or hold one side of the bottom) that it freezes/crashes faster?

      • Will Foster says:

        Interesting aside here, I can replicate this issue by lightly squeezing the laptop base by the left wrist rest (below keyboard) where I believe the embedded Intel GPU or wiring may be. I was also able to get the same GPU freezes in a bootable Windows 10 USB stick just shaking things around a bit physically. Same thing in just the BIOS, though no freeze only artifacts.

        I believe I am also experiencing a hardware defect or faulty wiring – I’ve opened a support ticket with Lenovo and my x270 will be sent in for repair/replacement.

      • venemo says:

        There is only so much you can do about it. Since you can reproduce the problem in a straightforward way, you should have no trouble getting a warranty repair, possibly a motherboard replacement.

        The Intel GPU is actually integrated into the CPU (along with the chipset, voltage regulators and everything else these days), so it is very unlikely that you can squeeze the GPU itself. Besides, it is already being squeezed by the heatsink. :)

        My guess would be that this kind of issue has to do with either damaged silicon (again, unlikely), or some sort of defect in the circuitry around it (eg. soldering error), or just damage to the motherboard PCB traces. When you squeeze it, it probably causes a short-circuit or a loose contact… or just could be that one of the ribbon cables half fell out of its connector.

      • Interesting enough I often (not always) saw the issue when reclining my elbow to the left of the touchpad. But this with 2 different models of Thinkpads. And now I have switched to i915 as described above, I don’t see the issue anymore (only 2 days testing though).

        I thought it is some kind of coincidence, still reading your comment it might be also a design deficiency of some kind too.

  3. venemo says:

    Hey Will, I’m just writing to let you know that I finally took the XPS back to the store and asked for a replacement. The new unit doesn’t exhibit any of the symptoms that the previous one had. I hope you will have similarly good luck with your machine. :)

  4. ole says:

    Hey,

    i have a Thinkpad E580 (i8250 CPU, UHD Graphics 820) and similar problems: The system freezes, especially under graphics load (scrolling in Firefox), but sometimes even at the login screen.

    I tried to downgrade to kernel 4.9 – didn’t help.

    To me it doesn’t look like a hardware defect as Windows 10 runs without any problems for some months of daily use now.

  5. I’ve been suffering with the random hangs for year+ on my Thinkpads with the newer chipset. Exactly as described here. I’m still on 4.8, so thinking < 4.9 will help is a lost cause.

  6. red says:

    Hi –

    I came across your post when dealing with a similar issue. After moving from my H270 platform to a Supermicro C236 Xeon platform, whenever I would use the QSV portions of the IGD graphics, I would get a hard lock. In fact, because a reset command sent from IPMI would not free the server, it suggests that there was a kernel deadlock.

    Below is the configuration that finally fixed everything for me. I have been going nearly 2 weeks without a single lock-up. During troubleshooting my issue, I ended up effectively replacing every single part in my server — so I can virtually guarantee that it wasn’t a H/W problem. At any rate, I hope my findings help you and/or others (especially since mine are Ubuntu related).

    Configuration:
    OS: Ubuntu 16.04.4 LTS
    Kernel: 4.17.0-994-generic (drm-tip)
    i915 srcversion: 9152110C37E0EC0FECD32F9

    *Note that the drm-tip builds of the kernel represent the absolute latest driver and kernel versions available. They shouldn’t be used on a production system without a full understanding of the risks.

    Enabling DMC, GUC, HUC:
    -Go here and download the appropriate binary blobs to the `/usr/lib/firmware/i915` folder.
    -Enable loading those binaries with the kernel: add `i915.enable_guc=2` to the `GRUB_CMDLINE_LINUX_DEFAULT=` line in `/etc/default/grub`.
    -Reinitialize the ramdisk: `sudo update-initramfs -k all -u`
    -Update grub boot parameters: `sudo update-grub`.
    -Reboot

    Verifying DMC, GUC, HUC are enabled:
    -`sudo cat /sys/kernel/debug/dri/1/i915_guc_load_status` : The `status` field should say `fetch SUCCESS, load SUCCESS`
    -`sudo cat /sys/kernel/debug/dri/1/i915_huc_load_status` : The `status` field should say `fetch SUCCESS, load SUCCESS`
    -`dmesg | grep i915`: Should reflect successful load of DMC, HuC, and GuC
    *Note that the `1` between `dri` and `/i915…` signifies the Intel IGD device. It is possible this could be a different number (if ASPEED is disabled or if another graphics card is present). You can list all of the video devices using `ls /dev/dri/` and select the number that corresponds to the IGD device.

  7. Phoenix says:

    I don’t know if my issue is related, but my Fedora 28 installation had the same symptoms.
    Intel i7 3630QM.
    journalctl -b -p err was reporting
    [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B

    I found this article: https://www.dedoimedo.com/computers/intel-microcode-atomic-update.html

    suggesting to disable psr, using a module configuration file i915.conf in /etc/modprobe.d
    containing the following option:
    options i915 enable_psr=0

    After setting this option, I did not face any freeze. However, I need some more time to be sure the problem has been fixed.

    • ole says:

      Doesn’t fix it here :(

      Frozen system after 2 min uptime.
      (Lenovo Thinkpad E580 /w Intel HD Graphics, Kernel 4.17.8).

      • Will Foster says:

        I ended up sending my laptop back for repair, they replaced the mainboard and SATA cables and I have no more issues now (using Fedora 28 and stock 4.17.x kernel) thus far.

      • ole says:

        Well, my Thinkpad came with Windows. And it works with Windows – no problem ever since February (daily use). Meanwhile Ubuntu still isn’t able to run without crash longer than 15 minutes. Most likely it’s just a (annoying!) Linux thing. I don’t think they would replace anything – the computer works ;-)

      • Will Foster says:

        Ah, I had the same problems in both Windows (via live USB stick) and even in the BIOS so it was definitely a hardware issue in my case. There’s a few kernel options you could try in the comments here, as well as making sure you’re using the actual kernel intel driver instead of the distribution provided driver in xorg.conf.

      • ole says:

        I think I tried almost all of them now – doesn’t work. Installing 4.18.6 doesn’t help either. It’s still a “total freeze” – everything stops, even SSH connections. No logs about it apparently.

        You can use the system for hours without using X – then again Linux will crash quite soon (1-2 min) while using Firefox. As I don’t have a second computer it’s hard to find out more about the problem on the Internet therefore – you have to reboot several times per hour ;-)

      • Will Foster says:

        What kind of GPU do you have? What Desktop Environment? If you’re using an Intel have you tried using uxa acceleration instead of sna?

        cat > /etc/X11/xorg.conf.d/20-intel.conf << EOF
        Section "Device"
        Identifier "Intel Graphics"
        Driver "intel"
        Option "AccelMethod" "uxa"
        EndSection
        EOF

      • ole says:

        Hej,
        thanks. At first it looked like it would fix it… after six minutes Ubuntu crashed again. :(

        It’s Intel HD Graphics 620. Ubuntu Mate.

      • ole says:

        Just updated to 4.19 an hour ago. No crash so far.

      • ole says:

        crashed after 65 minutes…

  8. Diego says:

    I had the same problem with a RAM module which got relatively loose. I found out by moving the laptop too.

    The effects are pretty similar to the GPU bugs. It is a different problem though. I have another laptop and it does freeze both in Debian and in Debian VMs but not with Windows. It’s a shame that there is such a bad support for GNU/Linux. I can’t rely on a laptop that will fail at any moment. Even if it’s once every 2 weeks uptime.

  9. mid says:

    I have an HP x360 laptop (8th gen Core i5 with Intel UHD graphics). I fixed it by updating the kernel to 4.18; no more freezing for now.

  10. caradoww says:

    Had this problem for a long while, but compiling that drm-tip kernel did the trick for me. I’ve been using it for many days, with no issues whatsoever.

    So that’s reassuring: at least in my case, it looks like it’s only a software issue, and the fix exists and will probably end up in the mainline kernel eventually. In the meantime, you just gotta compile that drm-tip linux kernel (and configure all the modules for your devices and stuff, look it up).

  11. Raju says:

    I have an Intel HD Graphics card in my laptop and the laptop screen works fine – it is just the external monitor that isn’t supported using any resolution other than 1280×800 – it used to support the 1920×1080 resolution prior to fedora 25, but, on the upgrade to fedora 26, I noticed the following message:

    Downgraded:
    xorg-x11-drv-nouveau.x86_64 1:1.0.15-1.fc26

    Prior to that everything worked just fine – then things started going awry, slowly disintegrating until now – I can’t run anything on my external monitor apart from 1280×800 mode. If I try switching modes – the external monitor will flicker badly and I cannot adjust anything – except if they are running on the laptop screen – as I can’t see anything on the external monitor except the message: “Input not Supported” or often – just a flickering screen.

    In between I had noticed that my screen flickering could be traced down to the wrong clocks being used when I ran xrandr. Essentially, when I had my screen working – I could run xrandr and get a predictable output and when it wasn’t working running xrandr showed an unpredictable output.

    I am not a fan of trying to recompile my kernel or chasing down a new graphics driver, as I believe the software was for some reason, intentionally changed and I believe there is no fix coming around any time soon. (My thoughts are to purchase a laptop that uses a non-Intel graphics chipset now – although, I find it hard to figure out if the other chipsets are supported well or not). I don’t believe that I have a hardware problem on my laptop either.

    I’ve documented much of my situation on the linux questions dot org bulletin board here:
    https://www.linuxquestions.org/questions/linux-laptop-and-netbook-25/getting-artifacts-and-occasional-signal-not-supported-messages-external-monitor-from-fc25-onwards-4175644093/

    Do take a look, if you are interested – as it has much more detail written out as well as some links to information that I came across regarding this issue.

  12. FH says:

    Hi, great post, thank you.

    I am getting graphics freezes on Fedora 25 with kernel 4.8, using an Intel HD Graphics 620 chip. lspci -v gives me “Kernel driver in use: i915” and “Kernel modules: i915”.

    Would you suggest starting a fix attempt by adding the kernel option GRUB_CMDLINE_LINUX="i915.enable_psr=0"?
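
The driver check mentioned in this comment can be scripted. The following is only a rough sketch: vga_driver is a hypothetical helper name, and it assumes the usual lspci -k layout, where each device line starts in the first column and its detail lines are indented.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel driver bound to the first VGA
# controller found in `lspci -k` style output read from stdin.
vga_driver() {
    awk '
        # Start of a VGA device block.
        /VGA compatible controller/ { in_vga = 1; next }
        # Detail line naming the bound driver: print its last field.
        in_vga && /Kernel driver in use:/ { print $NF; exit }
        # Any other unindented line starts a different device.
        in_vga && /^[^ \t]/ { in_vga = 0 }
    '
}

# Typical use on a live system (assumes pciutils is installed):
#   lspci -k | vga_driver
```

If this prints i915, the kernel driver discussed in this post is the one in use.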

    • Will Foster says:

      Heya FH, I think it’s worth a try. I’m using this on my x270 with the i915. This disables some of the power saving features (anecdotally, I have not seen much of a loss in efficiency) but solves flickering and other issues.

      https://wiki.archlinux.org/index.php/Intel_graphics#Screen_flickering

      I would give it a try and see, it can’t hurt.
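
For anyone trying this on Fedora, here is a minimal sketch of one way to apply and verify the option. It assumes grubby is installed (it normally is on Fedora) and that you run it as root; has_kernel_arg is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if kernel argument $1 appears as a whole
# word in the command line string $2. When $2 is omitted, the running
# kernel's /proc/cmdline is checked instead.
has_kernel_arg() {
    arg=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $arg "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: Fedora with grubby. Run as root, then reboot:
#   grubby --update-kernel=ALL --args="i915.enable_psr=0"
#
# After the reboot, confirm the option is active:
#   has_kernel_arg i915.enable_psr=0 && echo "PSR is disabled on the command line"
```

Checking /proc/cmdline after the reboot confirms the option reached the running kernel rather than only the bootloader configuration.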

      • FH says:

        Hi Will. Thanks again for the post. What ended up completely solving these random GPU freezes was tricking my computer into thinking a secondary display is attached. My assumption is that the machine spins up a secondary graphics card or supplies more power to the existing card, which in turn eliminates the freezes. For now I am using these https://www.headlessghost.com and they work perfectly. In the long run I will revisit this and see if I can change BIOS or kernel settings to achieve the same result. My machine is a Panasonic FZ-G1 MKIV with Intel HD Graphics 620, running 64-bit Fedora 25.

      • Will Foster says:

        Great to hear you got your issue solved. Most BIOSes have at least an option or two for this, like:

        1) Don’t require a display device to boot (probably not your case)
        2) Select the primary display device

        You might check whether #2 is available in your BIOS. If you set it to the display device you primarily use, it should care less about the others. Alternatively, disable the ones it picks up that aren’t always applicable.

        Lastly, Fedora 25 is very old; you might consider upgrading to Fedora 30 or 31 just to see if the issue fixes itself.

        You can easily upgrade skipping Fedora releases like this (I’ve done it recently from 24 -> 29, then again from 29 -> 31)

        dnf upgrade --refresh
        dnf install dnf-plugin-system-upgrade
        dnf system-upgrade download --releasever=29
        dnf system-upgrade reboot

        Then do the same thing to go from 29 to 30 or 31.

        https://fedoramagazine.org/upgrading-fedora-29-to-fedora-30/

  13. Paul Seyerlein says:

    Hello,

    I also had issues with random GUI freezes on Fedora 30 using Cinnamon. CPU usage showed the Xorg process consuming the most, while the system was basically idle.

    The driver conf file /etc/X11/xorg.conf.d/10-intel.conf seems to fix my issue! The system runs smooth now. Thank you very much!!!

    BR,
    Paul

  14. rfrail3 says:

    Thanks for the information, adding “i915.enable_psr=0” works nicely!

  15. darktube says:

    Similar issue with my ThinkPad W520. I get a black screen on both Linux and Windows. I tried “i915.enable_psr=0” but it didn’t work for me. It seems to be an overheating issue, as the blackouts are more frequent when the CPU temperature rises to about 50C. I therefore used thinkfan to run the fan at maximum speed and cpufreqd to limit the maximum frequency to 2 GHz. This seems to have solved my problem, and my CPU temperature is now around 40C.

  16. hirnukuono says:

    HP Elitebook 830 g7, Intel 620 uhd graphics. Ubuntu 20.10. Random freezes, mouse cursor stops, no keyboard (kb backlight does react to keypresses but nothing else), sysrq reisub not possible, only a full powercycle helps. Upgraded to 21.04, same thing. Installed the newest mainline kernel, downloaded newest firmwares from git, same thing. Tried newer ppa drivers, same thing. Grub cmdline max_cstates=1, same thing.

    Read this and a bunch of other articles and finally tried:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash i915.enable_guc=2 i915.enable_psr=0"

    == zero crashes since.

    Fixed. Nice. Thank you very much for this :)
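
On Ubuntu this line lives in /etc/default/grub. A minimal sketch of adding an option to it without editing by hand is below. add_grub_default_arg is a hypothetical helper name, and it only handles the single-line, double-quoted form shown in this comment. The argument must not contain |, & or backslashes, because it is pasted into a sed replacement.

```shell
#!/bin/sh
# Hypothetical helper: append kernel argument $2 to the
# GRUB_CMDLINE_LINUX_DEFAULT line in file $1, unless already present.
add_grub_default_arg() {
    file=$1
    arg=$2
    # Skip if the argument is already on the line (fixed-string match).
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$arg"; then
        return 0
    fi
    # Rewrite via a temporary file so this does not rely on GNU sed -i.
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $arg\"|" \
        "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Assumption: Ubuntu or Debian. Run as root, keep a backup, then
# regenerate the GRUB config and reboot:
#   cp /etc/default/grub /etc/default/grub.bak
#   add_grub_default_arg /etc/default/grub i915.enable_psr=0
#   update-grub
```

Running the helper twice leaves the line unchanged the second time, so it is safe to re-run.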

  17. Jej says:

    I discovered this freeze on a Dell 7490 + Debian 11 a few minutes after installing the ‘firmware-misc-nonfree’ package. I wanted to solve an issue with an i915 missing driver message during initrd updates. After removing this package and reinstalling the kernel to force the initrd update, it looks ok…
    Hope it helps.

  18. RobinRobin says:

    I did a lot of kernel testing and distro reinstalling until I finally made it here.
    Dell Latitude 7390 + Ubuntu 22.04, “i915.enable_psr=0” is my Lifesaver! When will it be fixed? The issue persists since 2018.. wtf..
