Development and testing of modified XNU Kernels for AMD CPUs on OS X
by: Shaneee
Maamoun wrote: Hey,
Can I help with emulating the SSE4.1 / 4.2 instructions? I'm skilled in C++ and ASM, so can anyone get me started and point me to where I can begin? Thank you.
Sure, that would be great. Take a look here: viewtopic.php?f=14&t=1150


by: Maamoun
I found the SSE instructions; actually, they are AVX2 instructions. They are in the assembly file osfmk/x86_64/lz4_decode_x86_64.s, which didn't exist in previous versions. However, the file handles this itself: the AVX2 instructions are selected at compile time. If we add this line:
Code: Select all
#undef __AVX2__
at the top of the file, the compiler will choose the older SSE instructions instead.
by: Shaneee
I'll try building a kernel with that, then send it for testing on an older CPU.

Edit: It didn't make a difference.
by: Maamoun
This is the only SSE code in the kernel. I don't know whether the old SSE instructions also fail or whether "#undef __AVX2__" simply has no effect. The problem is that I don't have macOS to run tests on, but I'm on Discord to discuss it.
by: Shaneee
The issue seems to be with SSE4.1 or 4.2. A few instructions were added to the opcode emulator, but not many. See here,
AnV Added a few SSE42 instructions
by: Shaneee
I'm not sure how to go about adding them. Netkas found some SSE4 instructions outside the kernel:
I found some SSE4.1 opcodes in libraries, not in the kernel.

otool -Vvt libobjc.A.dylib | grep roundss
000000000001c9dc roundss $0xa, %xmm0, %xmm0
000000000001cb28 roundss $0xa, %xmm0, %xmm0

otool -Vvt libbsm.0.dylib | grep pmovsxdq
000000000000e457 pmovsxdq (%rbx), %xmm0
000000000000e461 pmovsxdq 0x8(%rbx), %xmm0
000000000000e4fa pmovsxdq (%rbx), %xmm0
000000000000e504 pmovsxdq 0x8(%rbx), %xmm0
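
For reference, `roundss` and `pmovsxdq` are both SSE4.1 instructions, so an emulator would have to reproduce their results in software on a CPU that lacks them. The Python below is only a rough model of what each instruction computes, not code from the kernel or the opcode emulator. The function names `to_f32`, `emulate_roundss` and `emulate_pmovsxdq` are hypothetical, and the MXCSR rounding mode is assumed to be round-to-nearest unless the caller passes something else.

```python
import math
import struct


def to_f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def emulate_roundss(src, imm8, mxcsr_rc=0):
    """Model the low-lane result of `roundss $imm8, src, dst`.

    imm8 bits 1:0 pick the rounding mode (0 nearest-even, 1 down,
    2 up, 3 truncate). Bit 2 defers to MXCSR.RC instead (passed in
    here as mxcsr_rc). Bit 3 only suppresses the inexact exception,
    so it does not change the returned value.
    """
    mode = mxcsr_rc if imm8 & 0x4 else imm8 & 0x3
    x = to_f32(src)
    if math.isnan(x) or math.isinf(x):
        return x  # NaN and infinities pass through unchanged
    if mode == 0:
        r = round(x)       # Python rounds ties to even
    elif mode == 1:
        r = math.floor(x)
    elif mode == 2:
        r = math.ceil(x)
    else:
        r = math.trunc(x)
    # copysign keeps the sign of zero, e.g. ceil(-0.5) gives -0.0
    return math.copysign(float(r), x)


def emulate_pmovsxdq(mem8):
    """Model `pmovsxdq m64, xmm`: sign-extend two int32 to two int64.

    Takes the 8 source bytes and returns the 16 bytes that would end
    up in the destination register (little-endian).
    """
    lo, hi = struct.unpack("<2i", mem8)
    return struct.pack("<2q", lo, hi)


if __name__ == "__main__":
    print(emulate_roundss(1.2, 0xA))
    print(emulate_pmovsxdq(struct.pack("<2i", -1, 2)).hex())
```

Under this decoding, the `$0xa` immediate in the output above means "round up, inexact exception suppressed", which is consistent with how compilers commonly lower `ceilf`.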
by: Maamoun
If not many libraries use these instructions, I suggest patching them. I can do that using IDA.

Edit: There are too many of them. That's impossible unless someone tells us which libraries cause the problem.
by: Shaneee
Yeah, there are a lot. This is a list of them all:
Code: Select all
analyzing libChineseTokenizer.dylib
0000000000001e3e roundss $0xa, %xmm0, %xmm0
0000000000002199 roundss $0xa, %xmm0, %xmm0
00000000000026de roundss $0xa, %xmm0, %xmm0
0000000000003253 roundss $0xa, %xmm0, %xmm0

analyzing libFosl_dynamic.dylib
0000000000112368 roundss $0x9, %xmm0, %xmm0
0000000000112428 roundss $0xb, %xmm0, %xmm0
00000000001125ba roundss $0x9, %xmm2, %xmm0
00000000001125fa roundss $0xa, %xmm0, %xmm2
00000000001126e8 roundss $0xa, %xmm0, %xmm0
00000000001127a8 roundss $0x9, %xmm0, %xmm1
0000000000112873 roundss $0x9, %xmm2, %xmm2

analyzing libIASUnifiedProgress.dylib
00000000000011f0 roundss $0xa, %xmm0, %xmm2

analyzing libate.dylib
000000000000c5ef roundss $0xb, %xmm5, %xmm4
000000000000c5fd roundss $0xb, %xmm0, %xmm0
000000000000c60b roundss $0xb, %xmm3, %xmm3
000000000001f4f1 roundss $0x4, %xmm0, %xmm0
000000000002b01a roundss $0x4, %xmm15, %xmm0
0000000000044d82 vroundss $0xb, %xmm4, %xmm0, %xmm4
0000000000044d90 vroundss $0xb, %xmm0, %xmm0, %xmm0
0000000000044d9e vroundss $0xb, %xmm3, %xmm0, %xmm3
00000000000581c5 vroundss $0x4, %xmm1, %xmm0, %xmm1
0000000000060609 vroundss $0x4, %xmm0, %xmm0, %xmm0
0000000000070747 roundss $0xb, %xmm5, %xmm4
0000000000070755 roundss $0xb, %xmm0, %xmm0
0000000000070763 roundss $0xb, %xmm3, %xmm3
0000000000083649 roundss $0x4, %xmm0, %xmm0
000000000008ec04 roundss $0x4, %xmm15, %xmm0

analyzing libdtrace.dylib
000000000004086f roundss $0xa, %xmm0, %xmm0
0000000000040c49 roundss $0xa, %xmm0, %xmm0
000000000004106a roundss $0xa, %xmm0, %xmm0
000000000004210b roundss $0xa, %xmm0, %xmm0
0000000000043d1e roundss $0xa, %xmm0, %xmm0
00000000000447f8 roundss $0xa, %xmm0, %xmm0

analyzing libmecabra.dylib
000000000003120d roundss $0xa, %xmm0, %xmm0
0000000000031414 roundss $0xa, %xmm0, %xmm0
000000000003160e roundss $0xa, %xmm0, %xmm0
0000000000087f88 roundss $0xa, %xmm0, %xmm0
0000000000099d92 roundss $0xa, %xmm0, %xmm0
000000000009ac7a roundss $0xa, %xmm0, %xmm0
000000000009ad5e roundss $0xa, %xmm0, %xmm0
000000000009aec2 roundss $0xa, %xmm0, %xmm0
000000000009b5ea roundss $0xa, %xmm0, %xmm0
000000000009be4a roundss $0xa, %xmm0, %xmm0
00000000000a18de roundss $0xa, %xmm0, %xmm0
00000000000a1b10 roundss $0xa, %xmm0, %xmm0
00000000000f893a roundss $0xa, %xmm0, %xmm0
00000000000f8b32 roundss $0xa, %xmm0, %xmm0
00000000000f8d14 roundss $0xa, %xmm0, %xmm0
00000000000fa17e roundss $0xa, %xmm0, %xmm0
00000000000fa94e roundss $0xa, %xmm0, %xmm0
00000000000fd2f8 roundss $0xa, %xmm0, %xmm0
0000000000102031 roundss $0xa, %xmm0, %xmm0
000000000010228e roundss $0xa, %xmm0, %xmm0
0000000000112b6b roundss $0xa, %xmm0, %xmm0
0000000000112dde roundss $0xa, %xmm0, %xmm0
0000000000113509 roundss $0xa, %xmm0, %xmm0
000000000011387a roundss $0xa, %xmm0, %xmm0
000000000011f0c0 roundss $0xa, %xmm0, %xmm0
000000000012383c roundss $0xa, %xmm0, %xmm0
0000000000126140 roundss $0xa, %xmm0, %xmm0
0000000000126990 roundss $0xa, %xmm0, %xmm0
000000000012a26c roundss $0xa, %xmm0, %xmm0
000000000012bf64 roundss $0xa, %xmm0, %xmm0
000000000012d204 roundss $0xa, %xmm0, %xmm0
0000000000134a2d roundss $0xa, %xmm0, %xmm0
0000000000135d6f roundss $0xa, %xmm0, %xmm0
00000000001393fa roundss $0xa, %xmm0, %xmm0
000000000013958e roundss $0xa, %xmm0, %xmm0
0000000000141591 roundss $0xa, %xmm0, %xmm0
00000000001417d2 roundss $0xa, %xmm0, %xmm0
0000000000143362 roundss $0xa, %xmm0, %xmm0
0000000000143bcf roundss $0xa, %xmm0, %xmm0
0000000000143f48 roundss $0xa, %xmm0, %xmm0
000000000015b769 roundss $0xa, %xmm0, %xmm0
00000000001626a2 roundss $0xa, %xmm0, %xmm0
0000000000163b78 roundss $0xa, %xmm0, %xmm0
0000000000164e3e roundss $0xa, %xmm0, %xmm0
00000000001705ab roundss $0xa, %xmm0, %xmm0
0000000000172392 roundss $0xa, %xmm0, %xmm0
00000000001752e4 roundss $0xa, %xmm0, %xmm0
000000000017550e roundss $0xa, %xmm0, %xmm0
0000000000177343 roundss $0xa, %xmm0, %xmm0
000000000017753c roundss $0xa, %xmm0, %xmm0
0000000000177735 roundss $0xa, %xmm0, %xmm0
000000000017792e roundss $0xa, %xmm0, %xmm0
0000000000177b30 roundss $0xa, %xmm0, %xmm0
0000000000177d29 roundss $0xa, %xmm0, %xmm0
0000000000177f2b roundss $0xa, %xmm0, %xmm0
0000000000178120 roundss $0xa, %xmm0, %xmm0
000000000017856e roundss $0xa, %xmm0, %xmm0
00000000001869f4 roundss $0xa, %xmm0, %xmm0
000000000018e342 roundss $0xa, %xmm0, %xmm0
000000000018e434 roundss $0xa, %xmm0, %xmm0

analyzing libobjc.A.dylib
000000000001d8fe roundss $0xa, %xmm0, %xmm0
000000000001dbb8 roundss $0xa, %xmm0, %xmm0

analyzing libobjc.dylib
000000000001d8fe roundss $0xa, %xmm0, %xmm0
000000000001dbb8 roundss $0xa, %xmm0, %xmm0

analyzing libtailspin.dylib
0000000000007862 roundss $0xa, %xmm0, %xmm0
000000000000a69b roundss $0xa, %xmm0, %xmm0
000000000000b785 roundss $0xa, %xmm0, %xmm0
000000000000b906 roundss $0xa, %xmm0, %xmm0
000000000000c03a roundss $0xa, %xmm0, %xmm0
000000000000c18e roundss $0xa, %xmm0, %xmm0
000000000000c65f roundss $0xa, %xmm0, %xmm0
000000000000cad8 roundss $0xa, %xmm0, %xmm0
000000000000cd34 roundss $0xa, %xmm0, %xmm0
000000000000d0de roundss $0xa, %xmm0, %xmm0
000000000000d661 roundss $0xa, %xmm0, %xmm0
000000000000d7ae roundss $0xa, %xmm0, %xmm0
000000000000dc12 roundss $0xa, %xmm0, %xmm0
000000000000dd82 roundss $0xa, %xmm0, %xmm0
000000000000df08 roundss $0xa, %xmm0, %xmm0
000000000000e054 roundss $0xa, %xmm0, %xmm0
000000000001136a roundss $0xa, %xmm0, %xmm0
00000000000114b6 roundss $0xa, %xmm0, %xmm0
My guess is it's libobjc.A.dylib, but I can't be sure.
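
One way to get an overview of a listing like the one above is to count the hits per library, so the candidates can be ranked before anyone starts patching. The sketch below assumes the exact text layout shown in this post (an `analyzing <library>` header followed by `address mnemonic operands` lines). The names `tally`, `LINE_RE` and `sample` are made up for illustration, and the sample is just a few lines copied from the list.

```python
import re
from collections import Counter

# 16 hex digits, then the mnemonic, then the operands
LINE_RE = re.compile(r"^([0-9a-fA-F]{16})\s+(\w+)\s+(.*)$")


def tally(listing, watched=("roundss", "pmovsxdq")):
    """Count watched mnemonics per library in an 'analyzing <lib>' listing."""
    counts = {}
    current = None
    for raw in listing.splitlines():
        line = raw.strip()
        if line.startswith("analyzing "):
            # Start a new bucket for the library named in the header
            current = line.split(None, 1)[1]
            counts.setdefault(current, Counter())
            continue
        m = LINE_RE.match(line)
        if m and current is not None and m.group(2) in watched:
            counts[current][m.group(2)] += 1
    return counts


sample = """\
analyzing libIASUnifiedProgress.dylib
00000000000011f0 roundss $0xa, %xmm0, %xmm2

analyzing libobjc.A.dylib
000000000001d8fe roundss $0xa, %xmm0, %xmm0
000000000001dbb8 roundss $0xa, %xmm0, %xmm0
"""

if __name__ == "__main__":
    ranked = sorted(tally(sample).items(), key=lambda kv: -sum(kv[1].values()))
    for lib, c in ranked:
        print(lib, dict(c))
```

A hit count alone does not say which library actually triggers the failure; it only shows where the instructions are concentrated. `vroundss` is the AVX-encoded form, so it is deliberately left out of the default `watched` tuple and can be added if needed.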
by: ImAndreWhI
Is there a beta kernel for the macOS 10.13 Developer Preview?
