Thursday, September 8, 2011

Can Linux Kill Your Hardware - A Warning to Asus T101MT Owners

This post is as much an open poll of those who know their way around hardware as it is a warning to others who own an Asus T101MT.

If you have been by my blog recently you may know that I have been going back and forth with Asus support getting an RMA done on my T101MT. I sent the unit in with a bad screen; it was returned and had the same issue again, so it was RMAed again, and then the same issue happened a third time. On the assumption that my unit was simply beyond repair, I was sent a 100% new unit.

The first thing I do with any new computer I buy is wipe out the default operating system and install Linux. The T101MT was no exception. I installed the latest version of Bodhi Linux, which is powered by the 3.0.0 kernel, and much to my surprise, two boots later the brand new T101MT had the exact same screen issue as my previous unit.

Now the Bodhi team and I don't do anything crazy with our kernel configurations. In fact, our kernel builds currently come directly from the upstream Ubuntu packagers. So if you are using Linux on a T101MT, a word of warning - I would not upgrade to the 3.0 kernel any time soon (or to a distro that uses it).

Finally, my question to any hardware experts out there who might be reading this: is it possible for Linux to cause the internal display of a laptop to stop working (read: it isn't just the GUI or TTYs I can't get up; the system doesn't even post a BIOS screen on its internal display)? If so, any ideas how or why this could be happening? The unit(s) work 100% fine when attached to an external monitor, so I know the hardware is all working minus the internal screen.

My brain is screaming at me that software should not be able to kill hardware like this, but I am running out of debugging options.

EDIT/Update:

I got an external display set up this evening. So I booted the netbook up, entered the BIOS and reset the settings to their defaults - poof! My internal display worked again. Well, for a few moments anyway.

Every time E starts, the internal display in my netbook cuts out if it is the only display attached. In order to get it to come back online again I have to reset all my BIOS settings to defaults again. I have tracked the issue back to something E is doing, because the internal display does not freak out when using LXDE/OpenBox on the same system. The odd thing is the internal display works fine with E if I have an external screen attached at the same time.


So in short, Enlightenment is doing something at startup (when only the internal display is active) that disables the internal display at a BIOS level. I've spoken with E developers and they are of the opinion that nothing like this should be possible. I am thoroughly baffled and will have to run some more tests this weekend...

Any feedback/ideas are welcome and would be appreciated. I'm going to be fiddling around with the units some more in the next week to see if I can figure anything out myself.

Update 2:
This issue is resolved.... Oh me!

Regards,
~Jeff Hoogland




26 comments:

  1. My T101MT is staying on Ubuntu 10.10. It's working fine. And I had been thinking about going to Bodhi just a little while ago.

    Kevin.

  2. It's not unheard of. For example, the OS driver for my AMD/ATI graphics card causes it to run really hot, and one person has claimed that this cost him his mainboard. See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/563156

  3. Back in the CRT days, it was supposedly possible to blow out a monitor by putting severely incompatible settings into the X11 configuration file. Never seen it happen and isn't something that really CAN happen any more as far as I know, but back in the "configure by hand" days it was supposedly a potential issue.

    I don't know that it has anything to do with an LCD laptop-style screen though.

  4. Have you tried shutting down, unplugging and removing the battery for a few minutes, then hooking up the battery and power cord again and booting up? I've had some kernels interact with some ACPI configurations and turn off the LCD backlight at random times. Going through the above procedure, i.e. a full power cycle, was the only way to get the screen to come back on.

  5. If this issue is about the kernel causing the problem rather than weak hardware (and personally, I'd blame weak hardware first and delve into the reasons for that occurring, possibly the display interface processor), then surely regressing the kernel version would be your first test. I understand the predicament, though: the entire piece of hardware needs replacement between attempts, and you are not in the enviable position of just going into a parts store to obtain a replacement display. Since it is easy-peasy to test another distro via USB booting, and since your problem occurs so quickly, it would not be beyond a competent repairer to test and isolate such a problem.

    There are such things as bad batches of "silicon", and whilst they do not occur as often as they used to (Samsung self-admittedly had over 50% failure rates before they resolved the problems by appealing to the more advanced and experienced manufacturers), they do still frequently occur, especially with the current dubious quality controls of China; some manufacturers have just quit China rather than further risk their brand's reputation and face the impracticalities of delving into the problem in depth.

    To me, it seems the problem requires some leniency and recognition on the part of your ASUS repairer towards "non-windows-esque" users (W-esque = people who just ignorantly accept BS when they hear it), and ownership of the fact that their systems do not come completely error- and risk-free: that (a) their hardware is built to a competition-driven price and may come with "issues", and (b) they represent a manufacturer that has spent millions on R&D to improve their reputation in an industry where one user's bad experiences can easily be communicated globally.

    I remember the days when the Jaz2 (2GiB) drive came out (I was using Amigas at the time), and it ALWAYS failed me whenever I did a low-level SCSI format (the first thing I did); I remember RMA'ing 6 disks before returning the drive for my refund. The technician I bought it through suggested that I stop testing its integrity with a low-level format "before relying on it for archival purposes" (my words), and just use it like everybody else who didn't complain of a problem. Needless to say, I didn't use his services again, and hey, I was in a different city, and naively didn't expect that level of incompetence out of a person with an IT background. That was my first AND ONLY use of Jaz, ever, and I quickly reverted to a Syquest and had no issues with those (though many on PCs did, go figure).

  6. Could be ACPI, which Microsoft invented as a sabotage for GNU/Linux. The chipset may be reporting a clock speed that will fry it. The only way to prove malice is to check the tables to see if it reports the same thing to GNU/Linux as it does to Windows. The tables won't rule malice out, however, if the Windows drivers are intentionally set to ignore ACPI reporting, which would be a form of planned obsolescence because the laptop will fail on a Windows "upgrade" later.

    That said, I set my screen savers to blank screen or none to avoid stressing my systems. Chipsets should be able to run those 24/7 but I'd rather not find out.

  7. Might be BIOS related.
    I have an Asus and it sometimes acts really weird (some buttons get stuck, don't register anymore, or trigger automatically; it survives reboots and it happens in Linux and Windows. To "repair" it I need to enter the BIOS and simply save and exit). In your case the question is who is at fault? I'd guess it is the hardware (either bad parts or not sticking to some standard).

  8. Jeff,

    When a laptop does not even post a BIOS screen, you have a hardware problem - period. Odds are very high that the cable going through the hinge, or the hinge mechanism is at fault.

    If you did not have external video either, I would suspect a faulty / fried video card. This was happening a few years back with Nvidia chips that were connected to the heat sink with double sided foam tape (Duh HP, Dell, etc.)

    The three laptops that I own all suffered from the chip overheating problem. On the Dells (E1705), I replaced the removable video cards with ATI ones and they have given me no trouble since. On the HP, I reflowed the motherboard in an oven at 400 degrees F, then replaced the sticky tape with a copper slug. None have given me problems since.

    Good Luck!

  9. If you can't post BIOS, you may have fried the screen controller chip, downstream of the GPU. In that case you would still be able to work on an external monitor as it branches out before the panel adapter.

    If you can check the BIOS on the external monitor, see if it has internal checks. See if you can download a chipset diagnostic utility from ASUS.

    I had this issue back in the days of manually configuring resolution and refresh rates for CRTs, where the CRT feedback locked up my video card. I managed to reset its BIOS and get back in business.

    Software can't fry hardware, but it can overdrive it and expose inherent weaknesses.

    Good luck!

  10. This might not be related but I was having strange video freezes and sudden system reset issues while dual booting Bodhi and Win XP on an Asus PN5-E board. Take Bodhi out of the picture and the problems ceased. If I only cold booted to XP there would be no problem, but if I booted to Bodhi first, a reboot to XP would reset the system about 5 seconds into the welcome screen and occasionally the system would reset just after the BIOS post. My Quadro FX card was also briefly showing some strange green squares on an otherwise black screen during boot, which are now gone. Both Fedora 14 and Fusion linux (not Ubuntu, I know) don't cause this issue with the exact same hardware.

  11. My experience of Ubuntu killing hardware at least has a reasonable explanation. Two different laptops' speakers were killed by one of the older versions of Ubuntu. (I forget the version.) For some reason the shutdown beep was EXTREMELY loud in that version, and could not be controlled by the user. It was a "bug" complained about by many users. Before I found a fix to disable the shutdown beep, both of my laptops' speakers fried. I was able to replace just the speakers in one myself, and the other machine was having another hardware issue, so the whole MB was replaced under warranty, which included the speaker. So it is beyond a theoretical possibility that software can kill hardware, but in my case the explanation is obvious. Speakers can be fried by too high a volume.

  12. Have you tried connecting an external monitor? It may be that Linux is just inadvertently switching off the internal monitor.

    With a desktop you would get a post message if you fried the graphics card ( in the form of a series of beeps ). I think the same thing would happen on a laptop.

    Can you ssh into the laptop ( about five-ten minutes after you turn it on )? If so can you copy the logs and examine them?

  13. Sounds similar to an HP laptop that had one of those Nvidia chips with a production flaw. Basically the GPU would fry thanks to overheating.

  14. Check the edit/update if you care to give more input.

  15. Sounds like something in the OS is changing a BIOS setting that disables the internal monitor or its backlight. If the problem does not occur with other Linux distros, it's time to switch distros.

  16. OK. This is a dumb question, but usually laptops have a function key to turn the monitor on and off. Have you tried that key? It's not a fix but it could be a workaround.

    The rest depends on how much work you want to put in.

    The first thing I would do is let the system die. Then, without rebooting, plug in the external monitor. (BTW, does the system die if the external monitor is connected but not enabled?)

    If the external monitor works after you plug it in, is E still running or has it crashed?

    What if you boot with both external and internal monitor then disconnect the external monitor?

    Have you read the system logs? Look in particular for EDID entries.

    Also you might want to log an strace -f of E ( not in /tmp ! ). See if you can't find some signal killing some subprocess.

  17. Tried the function key already :-/

  18. I have a similar problem on my desktop sometimes. When I boot, my ASUS motherboard beeps a few times and nothing appears on the screen, not even the BIOS (despite that, behind this black curtain everything is working as usual). When I connect the monitor through the second cable (I have DVI and D-Sub on both the ATI graphics card and the monitor, so I can connect both at the same time) it automagically starts to work. After disconnecting D-Sub, it stops working once again. After some use it repairs itself, so I can disconnect D-Sub. Interestingly, when I keep the two cables connected, after some time it breaks too. I must test it once again; I use PCLinuxOS with E17 sometimes, so maybe this really is the reason?

  19. Check it and let me know please petrek!

  20. OK, I checked a few things already. After using E17 (with PCLinuxOS and Bodhi 1.1.0; I don't have the 1.2.0 version yet) everything worked fine, but I checked only in a live session. I get the same error beep only when I remove the graphics card from PCI-E. Now a few possibilities come to my mind. First, are you sure that this netabletbook is 100% new? How do you check this? If it is, how do you install Bodhi? Try to install from different USB keys, if you haven't already. Strange as it sounds, even when the md5sum is correct, the USB key and ISO can be broken. (In my case, after installations without changing partitions GRUB always hung. When I erased the whole disk during installation it was fine. I checked the md5sum of my USB key a few times, even after installation, and it was the same as on the web page. After installation from a new USB key and DVD everything worked like a charm. And I remember that sometimes I had errors on the PCLinuxOS CD; now I test it from a good CD.) Also, I wouldn't trust the kernel too much; it can be some regression, or even malice in it. I read lately that there was a silent attack on kernel.org and the source was changed, I don't know for how long, so it could spread some bad code. Do you have the latest, password-protected BIOS? Back to testing :)

  21. Can Linux Kill Your Hardware?
    Absolutely positively YES. Back in early 2000 I had a ThinkPad 770Z and during endless investigations on how to make everything work I came across a warning about some ACPI modules (IIRC) that would brick your ThinkPad by writing something into non-volatile memory (or some such, don't remember details). Later there was a patch added to disable those modules if the system was identified as a ThinkPad during boot. It was clearly reflected in dmesg. So yes, a new kernel can be screwing up your hardware.

  22. Oops! My bad. Not ACPI but lm-sensors.
    http://www.thinkwiki.org/wiki/Problem_with_lm-sensors

  23. Not a kernel issue. Rolled back to several different kernels that I know worked before and still have the same issue.

  24. Clearing CMOS, reflashing BIOS might help.

  25. Bodhi 1.2 with kernel 3.0.0-9-Generic
    T101MT 2nd Gen Screen (Intel N10 Family Graphics Controller (rev 02) ?) and calibrator installed. Touchpad worked OOB, except for multitouch, function key shortcuts and rotate button.

    installing ppa:chasedouglas/multitouch did not work, still single touch, which I could live with if only I could recover touch screen's 1/2" bottom section that stopped responding even after reinstalling other versions.

    I am starting to believe it's a BIOS problem, probably even related to Fast Bootup which I'm keeping now disabled, at least until I have a fully operational system.

    I just can't find the right setup, even though I made it work once with Ping-Eee_OS 11.04 but didn't take note how, even Fn-F4 was giving me 1024x768 screen resolution and touch-rotate. But decided to wipe off Windows partition and have a Tablet friendly OS such as Bodhi 1.1. If only the bottom portion of touchscreen worked again...

    I've read that Utouch works on 10.10 but do not know how to install with Bodhi 1.2 or if there is any other solution for this 2nd Gen. machine.

    Any help and ideas would be much appreciated. Probably upgrading or downgrading the BIOS or kernel 3.0.0-9?

    Jeff, have you had any luck lately with your new T101MT? Is it 2nd Gen.? What do you think...? And, by the way congratulations on such a nice OS as is Bodhi, if we could just make our little machines fully operational on it...

    Thanks. Victor

    P.S.
    Following are some specs and machine specific info...

    lsmod | grep hid:
    hid_multitouch 12851 0
    usbhid 41905 1 hid_multitouch
    hid 77367 2 hid_multitouch,usbhid


    lspci:
    00:00.0 Host bridge: Intel Corporation N10 Family DMI Bridge (rev 02)
    00:02.0 VGA compatible controller: Intel Corporation N10 Family Integrated Graphics Controller (rev 02)
    00:02.1 Display controller: Intel Corporation N10 Family Integrated Graphics Controller (rev 02)

    lsusb:
    Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
    Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
    Bus 003 Device 002: ID 0486:0186 ASUS Computers, Inc.
    Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
    Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
    Bus 001 Device 004: ID 13d3:5126 IMC Networks
    Bus 001 Device 003: ID 058f:6366 Alcor Micro Corp. Multi Flash Reader
    Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

  26. OK, so it's probably not the kernel. How about the BIOS? Has anyone already tried reflashing to an updated BIOS? I also haven't heard any word on this screen problem from ASUS, despite it being a recurring issue discussed on several other blog posts, including ASUS's own, in which a portion of the screen, either top or bottom, is rendered useless.

    Jeff, I also noted that sometimes, especially if I reboot instead of just shutting down the machine, during reboot I just get a black-dimmed screen, which I correct either by pressing F2 to bring up the BIOS screen or by pressing Fn-F6, which then increases the screen's brightness.

    UPDATED:

    Could all these settings (brightness, screen calibration, etc.) be changed at BIOS level? Or could the Boot Booster 16M EFI partition even have something to do with it? How about ACPI settings interfering with the BIOS and Boot Booster?

    Victor

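Comment 16 suggests reading the system logs and looking in particular for EDID entries. Here is a minimal sketch of that check in Python, assuming the kernel log has first been saved to a text file (for instance from dmesg output). The function name `find_edid_lines` and the sample lines are hypothetical and are not taken from a real T101MT log.

```python
import re
from typing import Iterable, List

# Case-insensitive match for the standalone word "EDID" in a log line.
EDID_PATTERN = re.compile(r"\bedid\b", re.IGNORECASE)


def find_edid_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention EDID, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if EDID_PATTERN.search(line)]


if __name__ == "__main__":
    # Hypothetical sample lines for illustration only.
    sample = [
        "[   12.345678] example: no EDID data found on internal panel\n",
        "[   12.400000] example: unrelated message\n",
    ]
    for hit in find_edid_lines(sample):
        print(hit)
```

To run it against a saved log, pass an open file instead of the sample list, e.g. `find_edid_lines(open("dmesg.txt"))`, where `dmesg.txt` is whatever file name the log was saved under.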