Monday, September 10, 2007
how I solved a significant Dell XPS 710 / Vista problem
I have blogged before about my problems with my work laptop, a Lenovo ThinkPad X60 Tablet, but not so much about the probably more serious problems I've been struggling with on my relatively new (May 2007) Dell XPS 710. The XPS 710 is both the most expensive home computer I've ever bought and the most problematic. Put succinctly, it frequently locks up, hangs, freezes -- whatever you want to call it -- and I have to hard power it off. As I was using this blog entry as research to build my tech support call, I noted the following symptoms:
- Things weren't too bad until I applied the latest series of patches, specifically the two Performance and Reliability patches
- Lockups have occurred while searching from the Start Menu and choosing See All Results
- Often a lockup would start with one app, and then spread to the rest -- I've seen this with things like filemon on multiple Vista machines and figured that was a cause, but at work I've seen Outlook and OneNote hold each other open and then get all the other apps to join their freeze-up party
- I had more than one lockup when changing "Use this folder type as a template" and checking "Also apply this template to all subfolders" (as an aside, I wish I could default this across the board to "All items" -- what can I say, I'm just used to sorting by date)
- Maybe it was a bad hard drive -- I began to notice Event ID 129 warning messages from nvstor32 in the event viewer stating "Reset to device, \Device\RaidPort0, was issued." Often these preceded the lock-up.
- I also saw ACPI Event ID 6 messages stating "IRQARB: ACPI BIOS does not contain an IRQ for the device in PCI slot 2, function 0." Besides slot 2, this also happened with slot 5, slot 4, slot 19, and slot 24.
- Switching in and out of full-screen mode in Unreal Tournament 2004 could lock up the machine
This last symptom led me on a quest that I think was a dead end, but I'm not entirely sure. When I first bought the PC and ran it for a number of days straight, I had the occasional lock-up. This seemed to go away, maybe with successful nVidia (graphics) driver updates, but after I applied the Performance and Reliability patch (I think) the lockups came back, and so I started looking at video card temperature as a possible reason. I downloaded MonitorView and nTune and found that my nVidia 8600 GTS did in fact seem to be running pretty hot (124°F from a cold start in 2D mode), and also noticed that the fan was at its lowest setting. Using nTune, I could set the fan to the maximum and achieve a temperature of 115°F in 2D, though it would increase to 130°F or so during games. Turning the fan up and decreasing the GTS's temperature seemed to reduce lockups, though I would still have a few a day. And nTune was annoying, because it couldn't automatically apply the fan setting, so I put it in Startup and set the fan manually every time. What a pain in the ass to set this four or whatever times a day, and still have lockups.
After some searching on the nvstor32 129 errors, I found this MS Forums post and follow-on searches turned up this blog entry, which pointed me in the right direction. I turned off Native Command Queuing in the "nVidia nForce Serial ATA Controller" for my hard drive to see if that helped, and so far I have had my computer on for almost 30 hours without a freeze and reboot, whereas it hadn't lasted two hours for the last few weeks. I assume that by posting this I tempt fate and perhaps my computer will "hear" me and begin to freeze again, but hopefully others who have this problem will see my blog post and have similar success.
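For anyone who wants to check whether the nvstor32 resets really do line up with their own freezes, here is a rough Python sketch of the comparison I was doing by eye in the event viewer. Everything in it is an assumption for illustration: the function name, the 10-minute window, and the sample timestamps are made up, and you would have to copy the real Event ID 129 times and your own lockup times in by hand.

```python
from datetime import datetime, timedelta

def lockups_preceded_by_reset(reset_times, lockup_times, window_minutes=10):
    """Return the lockups that had at least one reset warning shortly before them."""
    window = timedelta(minutes=window_minutes)
    matched = []
    for lockup in lockup_times:
        # A reset counts only if it happened before the lockup and within the window.
        if any(timedelta(0) <= lockup - reset <= window for reset in reset_times):
            matched.append(lockup)
    return matched

# Hypothetical sample data, not taken from my actual event log.
resets = [datetime(2007, 9, 8, 14, 2), datetime(2007, 9, 8, 19, 40)]
lockups = [
    datetime(2007, 9, 8, 14, 5),
    datetime(2007, 9, 8, 17, 30),
    datetime(2007, 9, 8, 19, 41),
]

matched = lockups_preceded_by_reset(resets, lockups)
print(f"{len(matched)} of {len(lockups)} lockups followed a reset within 10 minutes")
```

If most lockups turn up in the matched list, that at least points at the storage controller rather than, say, the video card.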
Now to figure out what exactly Native Command Queuing is, why it would cause such a problem, and why it only became a problem recently. Did a Performance and Reliability patch turn it on? Did some nVidia driver update do it? Why was I plagued with this? Would Dell have known if I had just called them? So many questions, so little time in the day to answer them…
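As far as I understand the basic idea, NCQ lets a SATA drive hold several outstanding requests (up to 32) and service them in whatever order it finds most efficient, rather than strictly in the order they arrived. The toy Python model below tries to show why that can help. It is only a sketch under heavy assumptions: the track numbers are invented, the "nearest first" rule is a stand-in I picked for simplicity, and real drive firmware also accounts for things like rotational position.

```python
def head_travel(start, requests):
    """Total distance the head moves when servicing requests in the given order."""
    travel, position = 0, start
    for target in requests:
        travel += abs(target - position)
        position = target
    return travel

def reorder_nearest_first(start, requests):
    """Greedy reordering: always service the closest pending request next."""
    pending, position, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order

# Hypothetical queue of track numbers, with the head starting at track 50.
queue = [90, 10, 80, 20, 70]
fifo_travel = head_travel(50, queue)
reordered = reorder_nearest_first(50, queue)
ncq_travel = head_travel(50, reordered)

print("arrival order:", queue, "travel:", fifo_travel)
print("reordered:    ", reordered, "travel:", ncq_travel)
```

In this made-up example the reordered queue needs well under half the head movement of the arrival order. None of this explains why the feature would hang a machine, only why it exists.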
My system is home-built, but it does use the same nForce SATA controller under Vista. The problem started a few days ago, but I can't pin it down to any particular install.
I have also read the solution to turn off NCQ but have yet to try it.
NCQ is a feature of newer SATA drives that can boost performance considerably, so it would be a shame to disable it permanently.
Hopefully nVidia/MS can get to the bottom of this.