Investigating a dead TouchBar

Today my TouchBar finally died...

I own a second-generation 15" MacBook Pro with the TouchBar (mid-2017 model).
Like many, I can't say I'm a big fan of this TouchBar gadget.
As a developer, spending most of my time coding or working in the Terminal and in vim, I miss real function and escape keys.

That being said, having TouchID on a Mac is awesome (T1 ftw), plus I usually use an external keyboard, so that's not really a big issue.
And of course, the machine itself is simply splendid.

Unfortunately, the TouchBar never really worked on this machine.
Most of the time, it took multiple reboots just to get it to power on. TouchID was working fine, so this was more of a display issue.
Also, I was running the beta versions of macOS High Sierra, so I thought it was a software-related issue.

But now it looks like it's definitely dead. No matter what I try, the TouchBar does not power on.
It's just a useless black ribbon, at the top of my keyboard.

Booting in diagnostics mode reports no issue.
And no luck resetting the SMC and PRAM either.

So I tried to investigate at the software level.

The TouchBar runs an operating system of its own, apparently a variant of watchOS, on a dedicated chip.
It communicates with macOS through a system service called TouchBarServer, used by the ControlStrip application (in /System/Library/CoreServices).

Now while there's a TouchBarServer process running on my machine, there's no ControlStrip process.
This is obviously an issue.
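The same check can be scripted. The following is only a minimal sketch: it assumes that `ps -axo comm=` lists one executable path per line, and the helper names (`list_running_commands`, `missing_processes`) are made up for this illustration. Only the two process names come from the observations above.

```python
import os
import subprocess

# The two process names come from the article; everything else is illustrative.
EXPECTED = ("TouchBarServer", "ControlStrip")


def list_running_commands():
    """Return the command of every running process, as reported by ps."""
    result = subprocess.run(
        ["ps", "-axo", "comm="],  # "comm=" prints the command column without a header
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def missing_processes(commands, expected=EXPECTED):
    """Return the expected process names that do not appear in commands."""
    # Compare on the last path component, so full paths and bare names both work.
    running = {os.path.basename(command) for command in commands}
    return [name for name in expected if name not in running]
```

Calling `missing_processes(list_running_commands())` returns the names that are expected but absent from the process list.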

I tried to start the ControlStrip process manually, but that doesn't work.
So I thought I could try debugging it with LLDB.

lldb /System/Library/CoreServices/

Running the process gives the following output:

Process 688 launched: '/System/Library/CoreServices/' (x86_64)
Process 688 stopped
* thread #1, queue = '', stop reason = signal SIGABRT
    frame #0: 0x00007fff7c0dbe4e libsystem_kernel.dylib`__pthread_kill + 10
->  0x7fff7c0dbe4e <+10>: jae    0x7fff7c0dbe58            ; <+20>
    0x7fff7c0dbe50 <+12>: movq   %rax, %rdi
    0x7fff7c0dbe53 <+15>: jmp    0x7fff7c0d31e8            ; cerror_nocancel
    0x7fff7c0dbe58 <+20>: retq   
Target 0: (ControlStrip) stopped.

Looks like abort is called somewhere, for some reason.
The backtrace gives:

* thread #1, queue = '', stop reason = signal SIGABRT
  * frame #0: 0x00007fff7c0dbe4e libsystem_kernel.dylib`__pthread_kill + 10
    frame #1: 0x00007fff7c21a150 libsystem_pthread.dylib`pthread_kill + 333
    frame #2: 0x00007fff7c038312 libsystem_c.dylib`abort + 127
    frame #3: 0x000000010004f07d ControlStrip`___lldb_unnamed_symbol1597$$ControlStrip + 61
    frame #4: 0x0000000100045aca ControlStrip`___lldb_unnamed_symbol1532$$ControlStrip + 490
    frame #5: 0x0000000100045aff ControlStrip`___lldb_unnamed_symbol1533$$ControlStrip + 15
    frame #6: 0x00007fff51946d62 AppKit`-[NSClassSwapper initWithCoder:] + 584

Let's try to break on the first function before abort:

b ControlStrip`___lldb_unnamed_symbol1597$$ControlStrip

After the usual prologue, we can see the following in the disassembly:

0x10004f046 <+6>:  callq  0x100079a92               ; symbol stub for: DFRCreateCAContext
0x10004f04b <+11>: movq   %rax, %rdi
0x10004f04e <+14>: callq  0x100079fb4               ; symbol stub for: objc_retainAutoreleasedReturnValue
0x10004f053 <+19>: movq   %rax, %rbx
0x10004f056 <+22>: movq   %rbx, %rdi
0x10004f059 <+25>: callq  0x100079f9c               ; symbol stub for: objc_release
0x10004f05e <+30>: testq  %rbx, %rbx
0x10004f061 <+33>: je     0x10004f06a               ; <+42>
0x10004f06a <+42>: leaq   0x31611(%rip), %rax       ; "TouchBarServer not running."
0x10004f071 <+49>: movq   %rax, 0x57a28(%rip)
0x10004f078 <+56>: callq  0x100079ed6               ; symbol stub for: abort

So ControlStrip calls a DFRCreateCAContext function, which fails, returning 0. It then stores a "TouchBarServer not running." message and calls abort.
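To make the branch easier to follow, here is a rough model of that control flow, transcribed into Python. It is not the actual ControlStrip code: `startup_check` and `Abort` are hypothetical names, the callable passed in merely stands in for DFRCreateCAContext, and an exception replaces the real call to abort.

```python
class Abort(Exception):
    """Stand-in for the SIGABRT raised by abort(); illustrative only."""


def startup_check(create_ca_context):
    """Mirror the branch seen in the disassembly.

    create_ca_context stands in for DFRCreateCAContext and returns a
    context object, or None when the call fails (0 in the original).
    """
    context = create_ca_context()   # callq DFRCreateCAContext
    if context is None:             # testq %rbx, %rbx followed by je <+42>
        raise Abort("TouchBarServer not running.")
    return context
```

Only the null check decides whether the process aborts, so any non-zero value in place of the context is enough to get past it.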

I already know the TouchBarServer process is running, so I can try making this test pass by placing the value 1 in the rbx register, just before the testq instruction.
This way, I'll avoid the call to abort:

p $rbx=1

The process indeed continues normal execution, outputting a ton of AutoLayout issues (c'mon Apple...).
But obviously, the display stays dark...

Now I'm convinced this is a hardware issue.
I also reinstalled a fresh copy of macOS High Sierra, and the problem persists.

Looking at the system logs, I can also see:

[DFR] [DFRDisplayRegisterForNotification_block_invoke] AddTerminatedNotification ret = 0x0
[DFR] [deviceTerminate] 
[DFR] ERR [_DFRDisplayHandleVendorPacket] DFR display not ready, possible hardware error
[DFR] ERR [GetInfoTimeout] get info timeout, retrying
[DFR] ERR [_DFRDisplayHandleVendorPacket] DFR display not ready, possible hardware error
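To pull the error entries out of a longer log, something like the following could be used. It is a sketch under assumptions: the line pattern is inferred only from the few entries quoted above, and `count_dfr_errors` is a name invented for this example.

```python
import re
from collections import Counter

# Pattern inferred from the handful of lines quoted above; real logs may vary.
DFR_LINE = re.compile(r"^\[DFR\]\s+(ERR\s+)?\[([^\]]+)\]\s*(.*)$")


def count_dfr_errors(lines):
    """Count ERR-level [DFR] entries per reporting function."""
    counts = Counter()
    for line in lines:
        match = DFR_LINE.match(line.strip())
        if match and match.group(1):  # group 1 is set only for ERR entries
            counts[match.group(2)] += 1
    return dict(counts)


# The log lines quoted in the article.
SAMPLE = [
    "[DFR] [DFRDisplayRegisterForNotification_block_invoke] AddTerminatedNotification ret = 0x0",
    "[DFR] [deviceTerminate] ",
    "[DFR] ERR [_DFRDisplayHandleVendorPacket] DFR display not ready, possible hardware error",
    "[DFR] ERR [GetInfoTimeout] get info timeout, retrying",
    "[DFR] ERR [_DFRDisplayHandleVendorPacket] DFR display not ready, possible hardware error",
]
```

Calling `count_dfr_errors(SAMPLE)` tallies the ERR lines by the function that reported them, which makes a repeating hardware error stand out from the informational entries.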

So yeah, I think I'm left with bringing this puppy to the nearest Apple Store for an exchange...
Fragile little things...

Jean-David Gadina
11/01/2017 21:07
Copyright © Jean-David Gadina
This article is published under the terms of the FreeBSD Documentation License.
