Also Skype refuses to launch. I reinstalled it several times. Launched it from different locations.

I have issues with Safari too. Sometimes it fails. Sometimes it doesn't start. I ran Disk Utility and it found file system issues, which I fixed.

So now I'm thinking this is from the memory or from the SSD. OCZ has no utility to check the drive status, and I don't know how to check whether any files have been destroyed or changed. I ran a memtest console tool to check the memory when I bought it - it was okay.

So now I'm looking for ways to diagnose where the problem is and fix it.

In testing, some more chances to capture screenshots have provided these examples of the failure:

I have an external HDD, and I'm using Carbon Copy Cloner to make a copy of my drive for backup purposes. I used it as a startup drive and the issues persisted - Skype refuses to start (the process takes 100% CPU), Safari starts but doesn't work (100% CPU).

I have two graphics cards - Intel HD Graphics 3000 and AMD Radeon HD6750. I have a tool that manually forces the use of only the first one - to save battery. Now I switched to the other one for a little while and the glitches stay the same.

I have old RAM - the original 4GB (Hynix) and 8GB (Crucial). I changed back to the 8GB and Skype starts the first time, but if I quit and then start it again we're back to where we were. With the 4GB (which I'm currently using) it didn't even want to start. So now it looks like the problem might be with the SSD?

I noticed App Store and Software Updates fail to connect to the internet.

I went into safe boot (hold Shift during restart) - now everything works - Skype starts every time, Software Update works (I updated Lion to the latest version). I changed the memory back to the 16GB and tested again in Safe Mode - again everything works. Now it all looks like a software problem.

Most of the time Skype starts correctly once. When I relaunch it, it fails until I restart the machine. Could Skype be messing up my machine?

An old version of Skype works - this is a temporary solution.

I saw a solution where Little Snitch was blocking Software Update and App Store. I stopped my Little Snitch and they started working again. So these are fixed. The new Skype still fails (sometimes it works the first time). Safari fails too. The glitches in the interface don't show up anymore. Everything works in Safe Mode.

I'm researching what is different in Safe Mode and cannot find a good answer.

I removed all software that had kext files installed (VirtualBox, Little Snitch, an antivirus).

Did I mention that I ran two free antivirus programs and neither found anything?

My Time Machine had a glitch and said it needed to create a new backup, as if it had just been started for the first time. Now I see only the backups for the last day even though the old backups seem to be there (same space used on Time Machine's HDD).

I removed the HWNetMgr, HWPortDetect and StartOuc startup items from /Library/StartupItems. They were from a Huawei 3G modem. A blog said that this would speed up my shutdown time. Now shutdown takes 1 second instead of 10.

Startup time seems quite long. For a machine like this it is like 15 - 20 seconds.

The only remaining item in /Library/StartupItems is Wireshark's ChmodBPF. I removed it too - didn't help.

/System/Library/LaunchAgents gives a lot of com.apple stuff, the only non-apple things are org.x.startx.plist and org.openbsd.ssh-agent.plist. Full list.
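The filtering described above can be sketched in a few lines of Python. This is only an illustration, not what was actually run here: the function name `third_party_plists` and the `com.apple.` prefix test are assumptions, and a vendor could in principle name its file with that prefix too.

```python
from pathlib import Path


def third_party_plists(directory, vendor_prefix="com.apple."):
    """Return sorted names of .plist files whose name lacks `vendor_prefix`.

    Hypothetical helper: it only looks at file names, not file contents.
    """
    folder = Path(directory)
    if not folder.is_dir():
        # A missing directory simply yields an empty result.
        return []
    return sorted(
        path.name
        for path in folder.glob("*.plist")
        if not path.name.startswith(vendor_prefix)
    )


if __name__ == "__main__":
    for location in ("/System/Library/LaunchAgents",
                     "/Library/LaunchAgents",
                     "/Library/LaunchDaemons"):
        print(location)
        for name in third_party_plists(location):
            print("   ", name)
```

Anything this prints is a candidate to disable temporarily while testing, since Safe Mode does not load third-party startup items.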

I used plutil to "check" all plist files. /Library/LaunchDaemons/*.plist, /Library/Preferences/*.plist and ~/Library/Preferences/*.plist are fine.
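For anyone without plutil at hand, a rough equivalent of that check can be sketched with Python's standard `plistlib`. This is an assumption-laden sketch rather than a replacement for plutil: the helper name `find_unreadable_plists` is made up, and it only confirms that each file parses as an XML or binary property list, not that its values make sense.

```python
import plistlib
from pathlib import Path


def find_unreadable_plists(directory):
    """Return {file name: error text} for .plist files that fail to parse."""
    folder = Path(directory).expanduser()
    problems = {}
    if not folder.is_dir():
        return problems
    for path in sorted(folder.glob("*.plist")):
        try:
            with path.open("rb") as handle:
                plistlib.load(handle)  # accepts XML and binary plists
        except Exception as exc:
            # plistlib can raise several exception types on damaged input,
            # so anything that stops the parse is recorded as a problem.
            problems[path.name] = f"{type(exc).__name__}: {exc}"
    return problems


if __name__ == "__main__":
    for location in ("/Library/LaunchDaemons",
                     "/Library/Preferences",
                     "~/Library/Preferences"):
        for name, error in find_unreadable_plists(location).items():
            print(f"{location}/{name}: {error}")
```

An empty result means every file parsed, which matches the "are fine" outcome above. It does not rule out a preference file that is well-formed but holds bad settings.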

Ran fsck -fy: "Incorrect number of extended attributes" and "Invalid leaf record count". I fixed them. DiskUtility said nothing.

I ran fsck_hfs -f /dev/disk1s2 on the external drive. It reports errors while the drive is mounted (read-only check) and no errors while it is not mounted. I guess it only says there are errors because it cannot check a mounted drive properly.

Google Drive's app starts its process, but there's no window and CPU is at 100%. The same behavior with Chrome (even though there's a window, WebProcess behaves the same way). The same with Skype. What do they have in common? An API?

I launched from the cloned CD and ran fsck_hfs -f on the SSD. It found no errors.

DiskWarrior found a lot of issues and fixed them all.

I ran Apple Hardware Test (AHT) from the original CD. No errors. I'll run the extended test tonight. Still nothing.

The external HDD (used for cloning) stopped working. Is this the perfect storm? Now I have to recopy everything. Could this, together with the non-working Time Machine, mean that I could lose my data? What's going on?

I installed OS X clean on an external drive - all worked - Skype, Safari. All was fine.

Finally progress. I reinstalled OS X Lion on top of the SSD. It turns out that the installer installs on top of the existing system, so I didn't have to migrate anything. Now Safari works, Skype 5.x works, Eclipse works. Interestingly when I started Eclipse it said Java is not installed and it installed it. I didn't know the installer worked this way. I'm using the 8GB ram to make sure all is fine. I'll monitor carefully what is going on. I'll stop writing here until a problem occurs.

Problem again. I updated my machine with Software Update and now it fails again.

I reinstalled. Now Skype works, Safari works, VMware Fusion works, SkyDrive fails. There are a few glitches, but this is the best state that my computer has been in for a while. I'm not going to update, because I think that is what does the damage.

SOLUTION (SO FAR): Then again all failed again. At some point even the laptop stopped recognizing the SSD as a drive. Then it started recognizing it again. I was so tired of all this, that at the end I was thinking of buying the new Macbook, but it being not easy to upgrade and actually not being any faster than what I have now made me think twice. I finally reinstalled it from scratch. I thought it would take weeks before I reinstall all the apps I had, but it took 2 hours, which was amazing - I just put all the apps in a list and some custom configurations too. Now it works and I hope it stays that way. I'm with the 16gigs of memory and the ssd drive. I hope this horror will never occur again. It actually looks like it was an ugly software bug maybe caused by something else. I actually don't care and don't want to know. I just don't want to experience it again. I then reinstalled my other macbook too and my iPhone. All of them now work faster and better. I guess OS X (and iOS) reinstallation can be as fulfilling as Windows reinstallation (which is definitely not a compliment). I just hope everything stays working. If it doesn't I'll just buy a new laptop (and I am thinking of a non-apple one).

As an epilogue: I learned a lot about the system, but then again I bought the MacBook so that I wouldn't have to learn. I bought it for the it-just-works experience and I didn't get it. All of the diagnostics failed. Finally the solution was the Microsoft Windows one - a clean reinstall.

A year later: the laptop had major hardware issues which were unsolvable. Now I'm happy with a new Air.

I would start with RAM - that sort of graphical anomaly is a GPU frame corruption issue - so the things you can change are making sure the firmware is updated, software updated and swap out the RAM. If that doesn't work, you might seek service to have the logic board/GPU tested.
–
bmike♦ May 10 '12 at 22:57

Remove the second ram stick and test. Then put the second chip in the first slot and test. Finally put the first in the second's slot. (and test) This will allow you to determine if either chip or either slot can be ruled in or out as a cause of the annoyance.
–
bmike♦ May 13 '12 at 22:09

Symptoms of a bad SSD typically involve freezing or kernel panics, or general performance issues. Bad RAM is unlikely but could manifest in all sorts of ways, so as was suggested by @bmike, run MemTEST and see if the RAM fails. My gut tells me it's neither though.
–
cksum May 14 '12 at 16:20

I've encountered a very similar issue on a machine without an SSD. I believe it was the logic board in my case, but this was a couple of years back. I've got a whole folder like this: i.stack.imgur.com/IaHXF.png
–
alexmuller May 16 '12 at 18:01

2 Answers
The best thing with troubleshooting is to isolate the issue and keep good notes when the issue comes and when it goes.

Once you also understand how to reproduce the issue (in your case, is that glitch constant or does it come and go?), it is then much easier to isolate things systematically.

In your case, switch the RAM to the opposite slots, then run with only one stick, then with only the other. Try to find out if it's the RAM slot, the motherboard or the RAM itself. Then you can isolate the SSD by running for a while from an external drive.

Repair technicians are very familiar with this process, and due to the volume of work they do, have better feel for what fails more often, have the tools to test booting your Mac from a clean OS externally, etc.

So even though you may not be as fast, skilled or familiar with troubleshooting by isolation, you can still use the same methodology to isolate this failure on your Mac.
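To make the bookkeeping concrete, here is a minimal sketch of the RAM isolation matrix and how its results could be read. Everything in it is illustrative: the stick labels "A" and "B", the slot numbers, and the function names `isolation_plan` and `interpret` are made up for this example, and it assumes the glitch can be reproduced reliably enough to give a yes/no answer per configuration.

```python
from itertools import product


def isolation_plan(sticks=("A", "B"), slots=(1, 2)):
    """List every single-stick configuration to test as (stick, slot)."""
    return list(product(sticks, slots))


def interpret(results):
    """Read test results given as {(stick, slot): True if the glitch appeared}."""
    failing = {config for config, glitched in results.items() if glitched}
    if not failing:
        return "no failure reproduced with single sticks"
    if failing == set(results):
        return "fails in every configuration: look beyond RAM and slots"
    for stick in sorted({s for s, _ in results}):
        # Failures that follow one stick across slots point at that stick.
        if failing == {c for c in results if c[0] == stick}:
            return f"suspect stick {stick}"
    for slot in sorted({n for _, n in results}):
        # Failures that stay with one slot regardless of stick point at the slot.
        if failing == {c for c in results if c[1] == slot}:
            return f"suspect slot {slot}"
    return "mixed pattern: retest, the fault may be intermittent"


if __name__ == "__main__":
    notes = {("A", 1): True, ("A", 2): True, ("B", 1): False, ("B", 2): False}
    print(isolation_plan())
    print(interpret(notes))
```

The point of writing the results down this way is the same as keeping good notes: a failure that follows a stick, stays with a slot, or shows up everywhere tells you three different things.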

As an Apple Certified Macintosh Technician, I endorse bmike's comment. Very good advice.
–
Dan Barrett May 11 '12 at 9:11

I'm doing it, but this takes a lot of time. I guess I'm asking if there is someone that had my issues so they can share their experience.
–
mist May 11 '12 at 9:41

I have seen this issue on 50+ Macs. Statistically RAM is the most likely candidate, but it can be GPU and it can be software corruption (or utilities like the GPU switcher you mentioned in your update). People can guess the cause, but you don't really care what the odds are - you want to know what is causing your symptom. As to your CCC test, it won't help. Instead, you want to erase your external HD and install a clean OS and test there. Don't move anything from the suspect system - install everything from a clean download and you'll know quickly if it's software or hardware.
–
bmike♦ May 11 '12 at 12:30

Thank you @bmike. You answered two questions I was interested in - "Has anyone seen this?" - you said 50+ cases. And "can my system files be corrupted?" - you said yes, because you want me to do a clean install.
–
mist May 11 '12 at 14:16

I never use a tester until after I've already tried to isolate the RAM itself. Testers can flag bad RAM, but not always. It depends on the first two paragraphs in my troubleshooting process. Can you make the glitch happen on command or know how long it takes to appear? If so - start with the RAM removal process. That works far better than testing RAM in my experience. Again - it's a guess, so you have to start somewhere and move forward.
–
bmike♦ May 11 '12 at 16:16

First, make sure you are using ECC RAM with appropriate specs for your system. Then check if you are getting ECC Errors (About This Mac -> More Info -> [System Report if on Lion -> ] Memory). If there are no memory errors reported, it is not likely to be a RAM issue.

Based on my experience, the visual anomalies you show are most likely either a graphics card issue or a graphics card RAM issue, but that would not explain the failure of Skype to launch. Bad RAM would explain the failure of Skype to launch and the visual anomalies, so it would be my first guess, but if there are no memory errors reported, then I would expect something more seriously wrong with the motherboard.

The only way I see it being a problem with the SSD is if it were drawing so much power or creating so much noise on the power bus that it was interfering with proper operation of the motherboard.

If motherboard, GPU issue - why is everything working in Safe Mode? I did an extensive AHT test and it said all is fine.
–
mist May 16 '12 at 10:49

@Mihail, "The laptop experienced two crashes in Safe Mode, which destroys all my previous assumptions that this is a software issue." So you cannot say everything is working in Safe Mode.
–
Old Pro May 16 '12 at 18:57

@Mihail, I forgot that laptop memory doesn't have ECC. Without ECC, it's very hard to detect memory errors, so the fact that there are no memory errors reported is not significant. Are you positive you have the right type, speed, CL, and buffering of memory? I would stick with the Apple memory that came with the laptop while testing.
–
Old Pro May 17 '12 at 22:22

I don't know if it is the right one. The Apple memory has the same issues too.
–
mist May 21 '12 at 21:15