Recently Intel started including their graphics drivers in the mainline Linux kernel. This is great, except when it stops working. Having suffered intermittent GPU freezes on my Lenovo x270 (Kaby Lake) work laptop since kernel 4.12, I came across a bug report that seemed related. Here’s my temporary fix on Fedora 28 for getting things stable again until it’s fixed for good upstream.
The Intel Integrated Graphics Crash
I started having complete system freezes intermittently where my laptop display would shudder / jitter and then hard lock. There was no real pattern to this happening nor was I able to get any log file information or journal information about the issue as it completely froze. No network, no ping, nothing.
VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02)
I’ve had this issue on both Skylake and Kabylake Intel-based laptop systems with integrated HD graphics (i915 driver) across Kernels 4.12+ through the 4.16.8 Fedora 28 Kernel. I recall this also happened on my previous Lenovo x240 though less frequently.
I am sure at some point this will be permanently fixed and trickle down to all the major Linux distributions. I was impatient and frustrated and wanted something to work right away; here’s how I got there.
Unrelated – Micro Freezes on Linux 5.x Kernels and i915
Update 2019-09-09: The original issue I walk through in this blog post is not related to the recent micro-freezes on Linux 5.x kernels with i915 integrated GPUs. There is a proposed patch pending for this here, and in the meantime I’ve been testing the following kernel option as a workaround:
i915.enable_psr=0
Panel Self Refresh (PSR), a power saving feature used by Intel iGPUs is known to cause flickering in some instances FS#49628 FS#49371 FS#50605. A temporary solution is to disable this feature using the kernel parameter i915.enable_psr=0.
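A minimal sketch of how the parameter could be applied and checked, assuming a Fedora system where `grubby` manages the boot entries. `has_kernel_arg` is a hypothetical helper name used only for illustration; it checks whether an argument appears on a kernel command line.

```shell
# Sketch: check whether a kernel argument is present on a command line string.
# has_kernel_arg is a hypothetical helper, not part of any existing tool.
has_kernel_arg() {
    wanted="$1"
    cmdline="$2"
    # Rely on word splitting to walk the space-separated arguments.
    for arg in $cmdline; do
        if [ "$arg" = "$wanted" ]; then
            echo "present"
            return 0
        fi
    done
    echo "missing"
    return 1
}

# To add the argument to every installed kernel on Fedora (needs root):
#   sudo grubby --update-kernel=ALL --args="i915.enable_psr=0"
# After a reboot, check the running kernel:
#   has_kernel_arg "i915.enable_psr=0" "$(cat /proc/cmdline)"
```

On other distributions the argument usually goes into the GRUB configuration instead, as several commenters below describe.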
Update: 2018-05-28: It seems that as of Skylake Fedora no longer uses the intel/i915 driver by default if you’re using Xorg. It instead uses the xorg-x11-drv-intel driver which means the testing below is not relevant currently (thanks to Venemo in the comments).
I am now doing more testing against the stock 4.16.10 kernel using the Intel driver in lieu of the xorg-x11-drv-intel driver.
To switch to the intel driver, create an Xorg shim config and then restart Xorg or reboot.
cat > /etc/X11/xorg.conf.d/10-intel.conf <<EOF
Section "Device"
    Identifier "Intel Graphics"
    Driver "intel"
EndSection
EOF
If you were running the xorg Intel drivers you’d have seen something like this in /var/log/Xorg.0.log:
[ 18.547] X.Org Video Driver: 23.0
[ 18.547] X.Org XInput driver : 24.1
[ 18.547] X.Org Server Extension : 10.0
If you are now running the Intel i915 driver you’d see this instead:
[ 17.369] (II) intel: Driver for Intel(R) Integrated Graphics Chipsets: i810, i810-dc100, i810e, i815, i830M, 845G, 854, 852GM/855GM, 865G, 915G, E7221 (i915), 915GM, 945G, 945GM, 945GME, Pineview GM, Pineview G, 965G, G35, 965Q, 946GZ, 965GM, 965GME/GLE, G33, Q35, Q33, GM45, 4 Series, G45/G43, Q45/Q43, G41, B43
[ 17.369] (II) intel: Driver for Intel(R) HD Graphics
[ 17.369] (II) intel: Driver for Intel(R) Iris(TM) Graphics
[ 17.369] (II) intel: Driver for Intel(R) Iris(TM) Pro Graphics
[ 17.370] (II) intel(0): Using Kernel Mode Setting driver: i915, version 1.6.0 20171222
[ 17.372] (--) intel(0): Integrated Graphics Chipset: Intel(R) HD Graphics 620
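Rather than eyeballing the log, a small check along these lines can report which Xorg driver is handling screen 0. This is a sketch under assumptions: the helper name `xorg_driver_in_use` is hypothetical, and it assumes the generic modesetting driver tags its lines with `modeset(0)` while the intel driver uses `intel(0)` as in the log excerpt above.

```shell
# Sketch: report which Xorg driver a log file shows for screen 0.
# xorg_driver_in_use is a hypothetical helper name.
xorg_driver_in_use() {
    log="${1:-/var/log/Xorg.0.log}"
    if grep -q 'intel(0)' "$log"; then
        echo "intel"
    elif grep -q 'modeset(0)' "$log"; then
        echo "modesetting"
    else
        echo "unknown"
    fi
}

# Example:
#   xorg_driver_in_use /var/log/Xorg.0.log
```

If Xorg runs without root privileges the log may live under `~/.local/share/xorg/` instead, so pass that path as the argument in that case.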
Update: 2018-05-29: I’m still getting GPU freezes with the 4.16.11 Fedora 28 kernel and the Intel i915 driver.
I have also tried the following kernel parameter which doesn’t help:
i915.enable_dc=0
I am now testing the latest drm-tip against the Fedora Rawhide 4.17.0-rc kernels as I have time to hopefully see if/when a fix appears.
Update: 2018-06-04: So far things have been stable for 4 days on 4.17.0-rc7 with the latest Intel drm-tip kernel tree modules copied in. I will keep this updated if I have another freeze.
Commenter Venemo has stated he’s still getting this freeze on 4.17.0-rc7, though on another system.
Update: 2018-06-05: I experienced another GPU freeze, but it took about 5 days of normal usage and dozens of suspend/resumes. For giggles I tried to trigger this in the BIOS and was able to get screen artifacts by jerking the laptop around and also by squeezing (a normal amount) the palm rest area of the laptop.
I also tried a Windows 10 USB stateless image and indeed I get the same GPU freezes there too. This makes me believe it’s a hardware defect. I’ve filed a ticket with Lenovo and I’ll be mailing my laptop in for repair/replacement. I’ll let ya know how it goes.
Update: 2018-07-17: I received my laptop back from repair and the motherboard and SATA cables were replaced. I’ve had zero issues for almost a month now on the 4.17.7+ kernel on Fedora 28. Here’s the work receipt from Lenovo (replaced assemblies PCB, motherboard, cables, wire):
I am leaving the rest of this blog post / guide up in case it might be useful for someone tracing down Intel GPU issues on Linux or filing bugs against upstream.
Testing Intel Upstream Linux Kernel Drivers
Below is how I previously tested the latest 4.17.0-rc kernel and Intel drm-tip kernel modules which may still be useful to others so I’m leaving it here.
Temporary Fix for the Intel Graphics Crash
The fix I found was to use the absolute latest Intel drm-tip git kernel code combined with a 4.17.0-rc5 kernel build. I then installed a Fedora development (Rawhide) kernel and copied the compiled Intel kernel modules in afterwards. YOLO.
The full docs for setting up the latest Intel stack are here, but I’m going to explain just the basics in case you are hitting this as well and want to get up and running quickly.
Build Kernel (Modules) Against drm-tip
This is going to clone a rather large git repository of all the upstream Intel drm bits and build the latest kernel and modules. Note that we’re omitting the actual make install of the kernel; we only care about the modules. You’ll need those modules later.
First you’ll need some build and compiler tools; this is what I needed to install beforehand:
sudo dnf install openssl-devel automake gcc elfutils-libelf-devel zlib-devel flex bison
Next build the thing against the latest upstream Intel drm-tip repository. This may take quite some time. Ironically the GPU froze on me a few times trying to build the latest drivers that should supposedly contain the crash fix! Maybe it could sense it.
export MY_DISTRO_PREFIX=/usr
export MY_DISTRO_LIBDIR=/usr/lib64
git clone git://anongit.freedesktop.org/drm-tip
cd drm-tip
# Start from the default config, then switch the DRM/i915 options to modules
make defconfig
sed -i 's/CONFIG_DRM_I915=y/CONFIG_DRM_I915=m/g' .config
sed -i 's/CONFIG_DRM=y/CONFIG_DRM=m/g' .config
sed -i 's/CONFIG_DRM_MIPI_DSI=y/CONFIG_DRM_MIPI_DSI=m/g' .config
sed -i 's/CONFIG_DRM_KMS_HELPER=y/CONFIG_DRM_KMS_HELPER=m/g' .config
sed -i 's/CONFIG_DRM_KMS_FB_HELPER=y/CONFIG_DRM_KMS_FB_HELPER=m/g' .config
sed -i 's/CONFIG_DRM_FBDEV_EMULATION=y/CONFIG_DRM_FBDEV_EMULATION=m/g' .config
sed -i 's/CONFIG_DRM_I915_CAPTURE_ERROR=y/CONFIG_DRM_I915_CAPTURE_ERROR=m/g' .config
sed -i 's/CONFIG_DRM_I915_COMPRESS_ERROR=y/CONFIG_DRM_I915_COMPRESS_ERROR=m/g' .config
sed -i 's/CONFIG_DRM_I915_USERPTR=y/CONFIG_DRM_I915_USERPTR=m/g' .config
sed -i 's/CONFIG_DRM_PANEL=y/CONFIG_DRM_PANEL=m/g' .config
sed -i 's/CONFIG_DRM_BRIDGE=y/CONFIG_DRM_BRIDGE=m/g' .config
sed -i 's/CONFIG_DRM_PANEL_BRIDGE=y/CONFIG_DRM_PANEL_BRIDGE=m/g' .config
sed -i 's/CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=y/CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=m/g' .config
# Build the kernel and modules, then install only the modules
make
sudo make modules_install
The last line will copy the kernel modules you create into /lib/modules/KERNEL_VERSION/kernel/drivers/gpu/drm along with a bunch of other kernel drivers we’re not going to need.
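The repeated sed calls above can also be written as a loop, which makes it harder to mistype a single line. The following is only a sketch: `set_as_module` is a hypothetical helper, it assumes GNU sed (for `-i`), and the option names in the example are taken from the sed lines above.

```shell
# Sketch: switch a list of kernel config options from built-in (=y) to module (=m).
# set_as_module is a hypothetical helper; it edits the given config file in place.
set_as_module() {
    config_file="$1"
    shift
    for opt in "$@"; do
        # Anchor the match so only the exact option line is rewritten.
        sed -i "s/^CONFIG_${opt}=y\$/CONFIG_${opt}=m/" "$config_file"
    done
}

# Example (run inside the drm-tip checkout after `make defconfig`):
#   set_as_module .config DRM DRM_I915 DRM_KMS_HELPER DRM_MIPI_DSI DRM_PANEL DRM_BRIDGE
#   grep -E '^CONFIG_DRM(_I915)?=' .config   # confirm both now end in =m
```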
If you get a compilation issue about asm-goto support you’ll need to comment that out of arch/x86/Makefile and try again:
#ifndef CC_HAVE_ASM_GOTO
# $(error Compiler lacks asm-goto support.)
#endif
Install Rawhide Development Kernel
While you could just run the drm-tip kernel, chances are you’d need a whole lot more modules configured/enabled for your hardware. I find it is much easier to just use your distribution’s latest kernel (if it matches latest upstream), as those are generally better configured for most hardware use cases and you’ll have everything reasonable provided as a loadable module.
You might substitute Rawhide here for your distribution’s development / bleeding edge kernel, like Tumbleweed for SuSE. For Fedora users I am providing the direct paths here, which may change, so double check the parent location.
cd /tmp/
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-core-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-devel-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-headers-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-modules-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
wget http://download-ib01.fedoraproject.org/pub/fedora/linux/development/rawhide/Workstation/x86_64/os/Packages/k/kernel-modules-extra-4.17.0-0.rc5.git1.1.fc29.x86_64.rpm
Install the Development 4.17+ Kernel
Now install the Rawhide development kernel and associated packages.
cd /tmp/
sudo dnf localinstall kernel-*.rpm
Copy the drm-tip Kernel GPU Modules Over
The current Rawhide kernel does not have the latest version of the Intel drivers that contain the actual fix, so we’re going to copy them in manually. This is fairly bad practice in general, but we don’t really care – we’d prefer something working to good etiquette that doesn’t.
Your paths and names may vary, but I copied over the entirety of the /lib/modules/KERNEL_MODULES_YOU_BUILT/kernel/drivers/gpu/drm/* into the modules location of the Rawhide kernel that I just installed.
sudo cp -Rv /lib/modules/4.17.0-rc5+/kernel/drivers/gpu/drm/* /lib/modules/4.17.0-0.rc5.git1.1.fc29.x86_64/kernel/drivers/gpu/drm/
Again, not the most elegant fix but gets the job done. At this point just reboot into the new kernel and if you had crashes before due to the bug I was hitting hopefully they have gone away.
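Because the copy overwrites files owned by the distribution kernel package, it may be worth keeping a backup so the change can be undone. The sketch below is one way to do that, not the procedure used above: `copy_drm_modules` is a hypothetical helper and all three paths are supplied by the caller.

```shell
# Sketch: back up the distro kernel's drm modules, then copy the drm-tip ones over.
# copy_drm_modules is a hypothetical helper; run it as root for paths under /lib/modules.
copy_drm_modules() {
    src="$1"     # e.g. /lib/modules/4.17.0-rc5+/kernel/drivers/gpu/drm
    dst="$2"     # e.g. /lib/modules/4.17.0-0.rc5.git1.1.fc29.x86_64/kernel/drivers/gpu/drm
    backup="$3"  # tarball that will hold the original modules

    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "source or destination directory is missing" >&2
        return 1
    fi

    # Archive the existing modules first so they can be restored later.
    tar -czf "$backup" -C "$dst" . || return 1

    # Copy the contents of src into dst, overwriting files with the same name.
    cp -R "$src"/. "$dst"/
}
```

After copying, running `sudo depmod -a` with the Rawhide kernel version refreshes the module dependency data, and if i915 is loaded from the initramfs a rebuild with `sudo dracut --force --kver` plus the same version may also be needed.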
Usage and Testing
After ~20 hours of GPU torture tests (hundreds of glxgears spinning, open/close images in a shell loop, suspend and resume constantly over and over, old games raging in wine) things seem pretty stable. Before I’d get hard GPU lockups anywhere from 6 minutes to 6 hours into normal desktop usage.
I realize this is a rather temporary blog post and I’m positive that all this will get fixed in upstream kernels. For now it was important (and frustrating enough) to find a fix as soon as possible and then write about it. I hope this helps someone else.
How to Debug an Intel GPU Crash
If you’re lucky enough to get logs or data written in case of a GPU crash there’s an easy way to gather debug information to file an Intel graphics driver bug.
sudo mount -t debugfs debug /sys/kernel/debug
sudo cat /sys/kernel/debug/dri/0/i915_error_state > i915_error_state
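When the machine has to be power cycled, the kernel messages from the previous boot are another place to look, provided the journal is persistent and the freeze left anything behind. A minimal sketch, where `gpu_log_lines` is a hypothetical helper that filters kernel log text for graphics-related lines:

```shell
# Sketch: pull GPU-related lines out of kernel log text read from stdin.
# gpu_log_lines is a hypothetical helper name; the pattern is a rough guess
# at useful keywords, not an exhaustive list.
gpu_log_lines() {
    grep -iE 'i915|drm|gpu hang'
}

# Example: kernel messages from the previous boot (needs a persistent journal):
#   journalctl -k -b -1 | gpu_log_lines
```

An empty result does not rule out a GPU problem, since a hard lock can stop the system before anything is written to disk.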
Hi!
I’m experiencing a similar freeze on a Dell XPS 13 9370. Basically the screen turns blank and the laptop stops reacting to any input – does this sound familiar?
Some questions:
– Do you use Wayland or Xorg when this occurs?
– In case of Xorg, are you sure you actually used the Intel X11 driver? On Fedora it is not used on Skylake and newer hardware.
Hey @venemo, yes this does sound similar to my issue minus the black screen – most of the time it just shudders a bit then freezes and laptop function keys respond but nothing else. Occasionally when resuming from suspend it will hang on a black screen (assuming my xscreensaver blank screen)
I’m using xorg. Interesting, you are correct it seemed to have been using the Xorg intel driver.
This makes me wonder if it’s instead a bug with the Xorg-x11-intel driver and has nothing to do with the i915/intel.
I’m going to try forcing the Intel driver usage now via /etc/X11/xorg.conf.d/10-intel.conf
Section "Device"
Identifier "Intel Graphics"
Driver "intel"
EndSection
I’m also testing out the following kernel parameter for Kabylake, I have no real scientific reason for this other than it was a workaround for a Baytrail bug that had similar symptoms.
i915.enable_dc=0
I’ll report back.
For me there are two different freezes:
1. most often it just freezes the current image on the screen, but with red stripes and dots at some places
2. sometimes the screen goes completely blank
Now interestingly some of these freezes are not actually the fault of the GPU, but rather my crappy Atheros wifi card. The ath10k wifi driver crashes and somehow takes the whole system with it. This has been to some degree mitigated by a firmware upgrade, but occasionally the firmware and/or the driver still crash. (But they don’t always bring down the entire system anymore.)
In other cases, however, the crash occurred under graphics load, so it is very unlikely (but not completely impossible) to be the fault of the wifi driver. In this case, yes, I suspect the Intel GPU to be at fault. Now in my previous experience on my previous laptop, the power saving parameters that everyone suggests to mindlessly apply to your kernel command line can and do cause instability. So, I’m currently experimenting with tweaking some kernel boot parameters that disable some power saving options.
With regards to the Intel X11 driver: the reason why Fedora (and Debian, etc.) decided to disable it on Skylake and above is because it has had a bad reputation of being quite unstable. See here: https://hansdegoede.livejournal.com/16976.html – I will leave it at that for now, but of course am interested in your findings!
Ok so far:
* Kernel 4.10.11 with Intel driver = freeze
* Kernel 4.10.11 with Intel driver and i915.enable_dc=0 = freeze
Back to the drawing board, I will keep testing newer drm-tip modules and 4.17.0-rc kernels from rawhide I suppose.
In the meantime if you could also comment here with your findings this might help:
https://bugs.freedesktop.org/show_bug.cgi?id=102586
So right now I’ve got these kernel parameters: i915.enable_fbc=0 i915.reset=1 i915.modeset=1 i915.disable_power_well=0 i915.enable_dc=0
– enable_fbc and disable_power_well were enabled by default, so I disabled them
– modeset was -1 by default I think
– added enable_dc=0 at your suggestion
Unfortunately these also make the kernel tainted… And now, in addition to the freezes, I started seeing GPU driver crashes in dmesg, with no real noticeable benefit or fewer crashes.
I’m starting to think that this is not (necessarily) a graphics problem (at least on my machine). Right now I’m testing a patch to the ath10k wifi driver which should eliminate at least part of the problem.
Since messing around with those kernel parameters actually made the system less stable, I’m not using those anymore.
Perhaps it may not be necessarily a GPU / Intel issue but I have an Intel wifi card for what it’s worth.
For me switching to i915 fixed the issue (Linux 5.6.11). No blackouts for two days now, and no extra kernel parameters. But I think package `xorg-x11-drv-intel` is the driver that we actually need because:
> $ rpm -ql xorg-x11-drv-intel
> /usr/bin/intel-virtual-output
> /usr/lib/.build-id
> /usr/lib/.build-id/27
> /usr/lib/.build-id/27/0d688a5df5157e21c7e2969aa00fd8bdee6c40
> /usr/lib/.build-id/31
> /usr/lib/.build-id/31/b1ec46e7ef1a2d3f0fff9f6c5ae4f37d78bdd1
> /usr/lib/.build-id/99
> /usr/lib/.build-id/99/41e2ea929352993cf5269a16f147dda6c50fb0
> /usr/lib/.build-id/d8
> /usr/lib/.build-id/d8/273a81f8eee9665c2b52b8b6cdb644784f54d4
> /usr/lib64/libIntelXvMC.so.1
> /usr/lib64/libIntelXvMC.so.1.0.0
> /usr/lib64/xorg/modules/drivers/intel_drv.so
> /usr/libexec/xf86-video-intel-backlight-helper
> /usr/share/doc/xorg-x11-drv-intel
> /usr/share/doc/xorg-x11-drv-intel/COPYING
> /usr/share/man/man4/intel-virtual-output.4.gz
> /usr/share/man/man4/intel.4.gz
> /usr/share/polkit-1/actions/org.x.xf86-video-intel.backlight-helper.policy
And then in `man intel` I see:
> SYNOPSIS
> Section “Device”
> Identifier “devname”
> Driver “intel”
> …
> EndSection
So I am wondering where is the “intel” driver that does not work and where is `i915` driver that does.
Other than that switching between the two works as described in the blog. I’m just asking about packaging clarification.
I’m going to take a run at the latest drm-tip upstream code again and kernel-4.17.0-0.rc7.git0.1.fc29.x86_64.rpm from Rawhide and see if I have any luck, stay tuned.
Just out of curiosity, did you try older kernels perhaps? Was there a version where this worked correctly?
Yeah I tried 4.11 and 4.8.6 Fedora kernels as well.
Ok I’ve got 4.17.0-rc7.git0.1.fc29.x86_64.rpm running now with drm-tip drm/i915 kernel modules as of 2018-05-30 running. So far so good, but then again I always say that.
How is it going since then? Still OK? If yes, I might give drm-tip a try myself. How did you patch the Fedora kernel up to drm-tip?
On my end, I’ve hooked up a serial port to see the messages from the kernel on another machine, hoping that something would turn up that points to the reasons for the random freezes. While this does yield useful info on the ath10k crashes it doesn’t show anything when the random freeze happens.
At this point I suspect that I’ve simply got a faulty motherboard. But drm-tip could be still worth a try if it works for you. :)
Hey Venemo,
3 days and counting so far of stability, mostly my laptop has been suspended (I’m using hybrid-suspend). In the next few days I’ll use it more often and report back.
What did you do to get drm-tip on top of the Fedora kernel? When I tried, git produced a patch file that wouldn’t apply.
See the updated post, I had to comment out some lines in the Makefile because I got asm-goto errors:
e.g. in
arch/x86/Makefile
#ifndef CC_HAVE_ASM_GOTO
# $(error Compiler lacks asm-goto support.)
#endif
I also ensured anything related to i915 was compiled as ‘m’ or module for the subsequent cp operation into the existing 4.17.0 Rawhide kernel modules tree (I think I covered it all with sed at least in the post).
So far things are stable, I’ve suspended and resumed about 10-15 times over 4 days and played some intensive wine / d3d games. I’m crossing my fingers but still keeping an eye on things.
up 3 days, 23:23, 1 user, load average: 0.82, 0.84, 0.86
Okay, I managed to do it: produced a patch using ‘git diff’ between 4.17-rc7 and latest drm-tip and then added the resulting patch file to the .spec file. The freeze is still there. I believe at this point I’ve exhausted all my options and should just accept that this is just a hardware defect.
4.17 kernel is now stable/released, the only other changes I saw mentioned here besides the slew of i915 fixes in the various rc changelogs:
drm/i915/lvds: Move acpi lid notification registration to registration phase
drm/i915/query: Protect tainted function pointer lookup
drm/i915/query: nospec expects no more than an unsigned long
drm/i915: Disable LVDS on Radiant P845
It’s discouraging that you still have the issue, I don’t want to return my work laptop for another (Lenovo?) model but if it is indeed a hardware defect perhaps they make one with an Nvidia card instead which honestly I’ve had no problems with on Linux. I really dig the x270 otherwise but this pretty much makes it unusable. I wonder if the same problem can be reproduced on Windows which would make it seem like it is for sure a hardware defect.
What model / generation laptop do you have again? Just trying to correlate what other people are experiencing at work.
This is a Dell XPS 13 9370. And I’m 99% sure this is just a random hardware defect.
This might sound crazy, but have you ever noticed if you pick up the laptop or move it (particularly pinch or hold one side of the bottom) that it freezes/crashes faster?
Interesting aside here, I can replicate this issue by lightly squeezing the laptop base by the left wrist rest (below keyboard) where I believe the embedded Intel GPU or wiring may be. I was also able to get the same GPU freezes in a bootable Windows 10 USB stick just shaking things around a bit physically. Same thing in just the BIOS, though no freeze only artifacts.
I believe I am also experiencing a hardware defect or faulty wiring – I’ve opened a support ticket with Lenovo and my x270 will be sent in for repair/replacement.
There is only so much you can do about it. Since you can reproduce the problem in a straightforward way, you should have no trouble getting a warranty repair, possibly a motherboard replacement.
The Intel GPU is actually integrated into the CPU (along with the chipset, voltage regulators and everything else these days), so it is very unlikely that you can squeeze the GPU itself. Besides, it is already being squeezed by the heatsink. :)
My guess would be that this kind of issue has to do with either damaged silicon (again, unlikely), or some sort of defect in the circuitry around it (eg. soldering error), or just damage to the motherboard PCB traces. When you squeeze it, it probably causes a short-circuit or a loose contact… or just could be that one of the ribbon cables half fell out of its connector.
Interestingly enough, I often (not always) saw the issue when resting my elbow to the left of the touchpad, and this was with 2 different Thinkpad models. Now that I have switched to i915 as described above, I don’t see the issue anymore (only 2 days of testing though).
I thought it was some kind of coincidence, but reading your comment it might also be a design deficiency of some kind.
Hey Will, I’m just writing to let you know that I finally took the XPS back to the store and asked for a replacement. The new unit doesn’t exhibit any of the symptoms that the previous one had. I hope you will have similarly good luck with your machine. :)
Hey @venemo, I’m glad to hear you got sorted. My Lenovo x270 is still out for repair but I hope I can report similar findings soon.
Hey,
I have a Thinkpad E580 (i8250 CPU, UHD Graphics 820) and similar problems: the system freezes, especially under graphics load (scrolling in Firefox), but sometimes even at the login screen.
I tried to downgrade to kernel 4.9 – didn’t help.
To me it doesn’t look like a hardware defect, as Windows 10 has been running without any problems for some months of daily use now.
I’ve been suffering with the random hangs for year+ on my Thinkpads with the newer chipset. Exactly as described here. I’m still on 4.8, so thinking < 4.9 will help is a lost cause.
Hi –
I came across your post when dealing with a similar issue. After moving from my H270 platform to a Supermicro C236 Xeon platform, whenever I would use the QSV portions of the IGD graphics, I would get a hard lock. In fact, because a reset command sent from IPMI would not free the server, it suggests that there was a kernel deadlock.
Below is the configuration that finally fixed everything for me. I have been going nearly 2 weeks without a single lock-up. During troubleshooting my issue, I ended up effectively replacing every single part in my server — so I can virtually guarantee that it wasn’t a H/W problem. At any rate, I hope my findings help you and/or others (especially since mine are Ubuntu related).
Configuration:
OS: Ubuntu 16.04.4 LTS
Kernel: 4.17.0-994-generic (drm-tip)
i915 srcversion: 9152110C37E0EC0FECD32F9
*Note that the drm-tip builds of the kernel represent the absolute latest driver versions and kernel version available. They shouldn’t be used on a production system without a full understanding of the risks.
Enabling DMC, GUC, HUC:
-Go here and download the appropriate binary blobs to the `/usr/lib/firmware/i915` folder.
-Enable loading those binaries with the kernel: add `i915.enable_guc=2` to the `GRUB_CMDLINE_LINUX_DEFAULT=` line in `/etc/default/grub`.
-Reinitialize the ramdisk: `sudo update-initramfs -k all -u`
-Update grub boot parameters: `sudo update-grub`.
-Reboot
Verifying DMC, GUC, HUC are enabled:
-`sudo cat /sys/kernel/debug/dri/1/i915_guc_load_status` : The `status` field should say `fetch SUCCESS, load SUCCESS`
-`sudo cat /sys/kernel/debug/dri/1/i915_huc_load_status` : The `status` field should say `fetch SUCCESS, load SUCCESS`
-`dmesg | grep i915`: Should reflect successful load of DMC, HuC, and GuC
*Note that the `1` between `dri` and `/i915…` signifies the Intel IGD device. It is possible this could be a different number (if ASPEED is disabled or if another graphics card is present). You can list all of the video devices using `ls /dev/dri/` and select the number that corresponds to the IGD device.
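To see which card number belongs to which kernel driver without guessing, sysfs can be walked as in the sketch below. It assumes the usual layout where each `/sys/class/drm/cardN/device/driver` is a symlink to the bound driver, and that the debugfs `dri/N` number matches the card number; `list_drm_drivers` is a hypothetical helper name.

```shell
# Sketch: print "cardN driver" for every DRM card that has a bound driver.
# list_drm_drivers is a hypothetical helper; the optional argument overrides
# the sysfs directory to scan.
list_drm_drivers() {
    base="${1:-/sys/class/drm}"
    for card in "$base"/card[0-9]*; do
        # Connector entries such as card0-eDP-1 have no driver link; skip them.
        [ -L "$card/device/driver" ] || continue
        driver=$(basename "$(readlink "$card/device/driver")")
        printf '%s %s\n' "$(basename "$card")" "$driver"
    done
}

# Example:
#   list_drm_drivers
```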
I don’t know if my issue is related, but my Fedora 28 installation had the same symptoms.
Intel i7 3630QM.
journalctl -b -p err was reporting
[drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B
I found this article: https://www.dedoimedo.com/computers/intel-microcode-atomic-update.html
suggesting to disable psr, using a module configuration file i915.conf in /etc/modprobe.d
containing the following option:
options i915 enable_psr=0
After setting this option, I did not face any freeze. However, I need some more time to be sure the problem has been fixed.
Doesn’t fix it here :(
Frozen system after 2 min uptime.
(Lenovo Thinkpad E580 /w Intel HD Graphics, Kernel 4.17.8).
I ended up sending my laptop back for repair, they replaced the mainboard and SATA cables and I have no more issues now (using Fedora 28 and stock 4.17.x kernel) thus far.
Well, my Thinkpad came with Windows. And it works with Windows – no problem ever since February (daily use). Meanwhile Ubuntu still isn’t able to run longer than 15 minutes without crashing. Most likely it’s just an (annoying!) Linux thing. I don’t think they would replace anything – the computer works ;-)
Ah, I had the same problems in both Windows (via live USB stick) and even in the BIOS so it was definitely a hardware issue in my case. There’s a few kernel options you could try in the comments here, as well as making sure you’re using the actual kernel intel driver instead of the distribution provided driver in xorg.conf.
I think I tried almost all of them now – doesn’t work. Installing 4.18.6 doesn’t help either. It’s still a “total freeze” – everything stops, even SSH connections. No logs about it apparently.
You can use the system for hours without using X – then again Linux will crash quite soon (1-2 min) while using Firefox. As I don’t have a second computer it’s hard to find out more about the problem on the Internet – you have to reboot several times per hour ;-)
What kind of GPU do you have? What Desktop Environment? If you’re using an Intel have you tried using uxa acceleration instead of sna?
cat > /etc/X11/xorg.conf.d/20-intel.conf << EOF
Section "Device"
Identifier "Intel Graphics"
Driver "intel"
Option "AccelMethod" "uxa"
EndSection
EOF
Hej,
thanks. At first it looked like it would fix it… after six minutes Ubuntu crashed again. :(
It’s Intel HD Graphics 620. Ubuntu Mate.
Just updated to 4.19 an hour ago. No crash so far.
crashed after 65 minutes…
I had the same problem with a RAM module which got relatively loose. I found out by moving the laptop too.
The effects are pretty similar to the GPU bugs. It is a different problem though. I have another laptop and it does freeze both in Debian and in Debian VMs but not with Windows. It’s a shame that there is such a bad support for GNU/Linux. I can’t rely on a laptop that will fail at any moment. Even if it’s once every 2 weeks uptime.
I have an HP x360 laptop (8th-gen Core i5 with Intel UHD graphics). I fixed it by updating the kernel to 4.18; no more freezing for now.
Glad to hear it’s working so far, what kernel / distribution were you using before when it had issues? Keep us updated.
Had this problem for a long while, but compiling that drm-tip kernel did the trick for me. I’ve been using it for many days, with no issues whatsoever.
So that’s reassuring: at least in my case, it looks like it’s only a software issue, and the fix exists and will probably end up in the mainline kernel eventually. In the meantime, you just gotta compile that drm-tip linux kernel (and configure all the modules for your devices and stuff, look it up).
I have an Intel HD Graphics card in my laptop and the laptop screen works fine – it is just the external monitor that isn’t supported using any resolution other than 1280×800 – it used to support the 1920×1080 resolution prior to fedora 25, but, on the upgrade to fedora 26, I noticed the following message:
Downgraded:
xorg-x11-drv-nouveau.x86_64 1:1.0.15-1.fc26
Prior to that everything worked just fine – then things started going awry, slowly disintegrating until now – I can’t run anything on my external monitor apart from 1280×800 mode. If I try switching modes – the external monitor will flicker badly and I cannot adjust anything – except if they are running on the laptop screen – as I can’t see anything on the external monitor except the message: “Input not Supported” or often – just a flickering screen.
In between I had noticed that my screen flickering could be traced down to the wrong clocks being used when I ran xrandr. Essentially, when I had my screen working – I could run xrandr and get a predictable output and when it wasn’t working running xrandr showed an unpredictable output.
I am not a fan of trying to recompile my kernel or chasing down a new graphics driver, as I believe the software was for some reason, intentionally changed and I believe there is no fix coming around any time soon. (My thoughts are to purchase a laptop that uses a non-Intel graphics chipset now – although, I find it hard to figure out if the other chipsets are supported well or not). I don’t believe that I have a hardware problem on my laptop either.
I’ve documented much of my situation on the linux questions dot org bulletin board here:
https://www.linuxquestions.org/questions/linux-laptop-and-netbook-25/getting-artifacts-and-occasional-signal-not-supported-messages-external-monitor-from-fc25-onwards-4175644093/
Do take a look, if you are interested – as it has much more detail written out as well as some links to information that I came across regarding this issue.
Hi, great post, thank you.
I am getting graphic freeze ups on a Fedora 25 kernel 4.8 using an Intel HD Graphics 620 chip. LSPCI -v gives me “Kernel driver in use: i915” and “Kernel modules: i915”.
Would you suggest starting a fix attempt with adding the kernel option GRUB_CMDLINE_LINUX="i915.enable_psr=0"?
Heya FH, I think it’s worth a try. I’m using this on my x270 with the i915. This disables some of the power saving features (anecdotally, I have not seen much of a loss in efficiency) but solves flickering and other issues.
https://wiki.archlinux.org/index.php/Intel_graphics#Screen_flickering
I would give it a try and see, it can’t hurt.
Hi Will. Thanks again for the post. So what ended up completely solving this issue of random GPU freeze ups is tricking my computer into thinking there is a secondary display attached. My assumption is that what happens here is that the machine spins up a secondary graphics card or issues more power to the existing card, which in turn eliminates these GPU freeze ups. For now I am using these https://www.headlessghost.com and they work perfectly. In the long run I will revisit this and see if I can change BIOS or kernel settings to achieve the same results. My machine is a Panasonic FZ-G1 MKIV, Intel HD Graphics 620 running Linux Redhat Fedora 25 64bit.
Great to hear you got your issue solved. Typically most BIOS has at least an option or two for this like:
1) Don’t require a display device to boot (probably not your case)
2) Select the primary display device
You might check if #2 is available in your BIOS, if you set it to the display device you primarily use it should care less about others, or disable the ones that aren’t always applicable that it picks up.
Lastly, Fedora 25 is very old, you might consider upgrading to Fedora 30 or 31 just to see if the issue will fix itself.
You can easily upgrade skipping Fedora releases like this (I’ve done it recently from 24 -> 29, then again from 29 -> 31)
dnf upgrade --refresh
dnf install dnf-plugin-system-upgrade
dnf system-upgrade download --releasever=29
dnf system-upgrade reboot
Then do the same thing to go from 29 to 30 or 31.
https://fedoramagazine.org/upgrading-fedora-29-to-fedora-30/
Hello,
I also had issues with random GUI freezes on fedora 30 using cinnamon. My cpu usage was showing the Xorg process consuming the most, while basically being idle.
The driver conf file /etc/X11/xorg.conf.d/10-intel.conf seems to fix my issue! The system runs smooth now. Thank you very much!!!
BR,
Paul
Thanks for the information, adding “i915.enable_psr=0” works nicely!
Similar issue with my Thinkpad W520. I get a black screen on both Linux and Windows. I tried “i915.enable_psr=0” but it didn’t work for me. It seems like an overheating issue, as the blackouts are more frequent when the CPU temperature rises above 50C. I thus used thinkfan to let the fan run at maximum speed and cpufreqd to limit my maximum frequency to 2G. This seems to have solved my problem and my CPU temperature is now around 40C.
Strange, 50C isn’t really that hot – but if you can correlate it and it happens maybe you’re onto something.
HP Elitebook 830 g7, Intel 620 uhd graphics. Ubuntu 20.10. Random freezes, mouse cursor stops, no keyboard (kb backlight does react to keypresses but nothing else), sysrq reisub not possible, only a full powercycle helps. Upgraded to 21.04, same thing. Installed the newest mainline kernel, downloaded newest firmwares from git, same thing. Tried newer ppa drivers, same thing. Grub cmdline max_cstates=1, same thing.
Read this and a bunch of other articles and finally tried:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash i915.enable_guc=2 i915.enable_psr=0"
== zero crashes since.
Fixed. Nice. Thank you very much for this :)
Hey Hirnukuono, I’m so glad this fixed your problem!
I discovered this freeze on a Dell 7490 + Debian 11 a few minutes after installing the ‘firmware-misc-nonfree’ package. I wanted to solve an issue about an i915 missing driver message during initrd updates. After removing this package and reinstalling the kernel to force the initrd update, it looks ok…
Hope it helps.
I did a lot of kernel testing and reinstalling distros until I finally made it here.
Dell Latitude 7390 + Ubuntu 22.04, “i915.enable_psr=0” is my Lifesaver! When will it be fixed? The issue persists since 2018.. wtf..