Development and testing of modified XNU Kernels for AMD CPUs on OS X
by: Shaneee
Maamoun wrote: Hey,
can I help with emulating the SSE4.1/4.2 instructions? I'm skilled in C++ and ASM, so can anyone get me started and point me to where I can begin? Thank you.
Sure, that would be great. Take a look here: viewtopic.php?f=14&t=1150


by: Maamoun
I tracked down the SSE instructions; they are actually AVX2 instructions. They are in the assembly file osfmk/x86_64/lz4_decode_x86_64.s, which didn't exist in previous versions. However, the file carries its own fallback: at compile time the compiler chooses the AVX2 instructions. If we add this line:
#undef __AVX2__
at the top of the file, the compiler will choose the older SSE instructions instead.
by: Shaneee
I'll try building a kernel with that and send it for testing on an older CPU.

Edit: It didn't make a difference.
by: Maamoun
This is the only SSE code in the kernel. I don't know whether the older SSE instructions also fail or whether "#undef __AVX2__" simply has no effect. The problem is that I don't have macOS to run tests on, but I'm on Discord if anyone wants to discuss it.
by: Shaneee
The issue seems to be with SSE4.1 or SSE4.2. Some instructions were added to the opcode emulator, but not many. See here:
AnV Added a few SSE42 instructions
by: Shaneee
I'm not sure how to go about adding them. Netkas found some SSE4 instructions outside the kernel:
I found some SSE4.1 opcodes in libraries, not in the kernel.

otool -Vvt libobjc.A.dylib | grep roundss
000000000001c9dc roundss $0xa, %xmm0, %xmm0
000000000001cb28 roundss $0xa, %xmm0, %xmm0

otool -Vvt libbsm.0.dylib | grep pmovsxdq
000000000000e457 pmovsxdq (%rbx), %xmm0
000000000000e461 pmovsxdq 0x8(%rbx), %xmm0
000000000000e4fa pmovsxdq (%rbx), %xmm0
000000000000e504 pmovsxdq 0x8(%rbx), %xmm0
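
For anyone extending the opcode emulator, what these two instructions compute is fairly small. Below is a minimal Python sketch of their semantics only, not the emulator's actual code: the function names and the mxcsr_rc parameter are hypothetical, and a real kernel implementation would be written in C and would also have to decode the ModRM operands and handle MXCSR exception flags, which this ignores.

```python
import math
import struct

def _to_f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def roundss(value, imm8, mxcsr_rc=0):
    """Sketch of ROUNDSS on one single-precision value.

    imm8 bits 1:0 select the rounding mode, bit 2 means "use MXCSR.RC
    instead", and bit 3 suppresses the precision exception (ignored here).
    mxcsr_rc is a hypothetical stand-in for the MXCSR rounding-control field.
    """
    x = _to_f32(value)
    if math.isnan(x) or math.isinf(x):
        return x
    mode = mxcsr_rc if imm8 & 0x4 else imm8 & 0x3
    if mode == 0:
        r = round(x)          # Python's round() is round-half-to-even
    elif mode == 1:
        r = math.floor(x)     # toward negative infinity
    elif mode == 2:
        r = math.ceil(x)      # toward positive infinity
    else:
        r = math.trunc(x)     # toward zero
    # Keep the sign of the input so that e.g. ceil(-0.5) yields -0.0.
    return math.copysign(float(r), x)

def pmovsxdq(src):
    """Sketch of PMOVSXDQ: sign-extend two packed 32-bit ints to 64 bits."""
    lo, hi = struct.unpack("<2i", src[:8])
    return struct.pack("<2q", lo, hi)
```

Under this decoding, the $0xa immediate in the roundss hits above is a round-up (ceiling) with the inexact exception suppressed.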
by: Maamoun
If there aren't too many libraries that use these instructions, I suggest patching them. I can do that using IDA.

Edit: There are too many of them. That's not feasible unless someone can tell us which libraries cause the problem.
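
As a rough way to locate patch candidates without a disassembler, one can scan a code section for the legacy roundss encoding (66, optional REX prefix, 0F 3A 0A, then ModRM and imm8). This is only a sketch with a hypothetical function name: a raw byte scan can match data or the middle of another instruction, and it does not cover the VEX-encoded vroundss form, so every hit would still need to be confirmed in a disassembler.

```python
import re

# 66 [optional REX 40-4F] 0F 3A 0A, followed by ModRM and imm8 bytes.
_ROUNDSS = re.compile(rb"\x66[\x40-\x4f]?\x0f\x3a\x0a")

def find_roundss_candidates(code, base=0):
    """Return addresses of byte sequences that look like legacy ROUNDSS.

    code is the raw bytes of a text section; base is the address at which
    that section is loaded (both supplied by the caller).
    """
    return [base + m.start() for m in _ROUNDSS.finditer(code)]
```

For example, the bytes 66 0F 3A 0A C0 0A encode roundss $0xa, %xmm0, %xmm0, so a buffer containing them would be reported at that offset.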
by: Shaneee
Yeah, there are a lot. This is a list of them all:
analyzing libChineseTokenizer.dylib
0000000000001e3e roundss $0xa, %xmm0, %xmm0
0000000000002199 roundss $0xa, %xmm0, %xmm0
00000000000026de roundss $0xa, %xmm0, %xmm0
0000000000003253 roundss $0xa, %xmm0, %xmm0

analyzing libFosl_dynamic.dylib
0000000000112368 roundss $0x9, %xmm0, %xmm0
0000000000112428 roundss $0xb, %xmm0, %xmm0
00000000001125ba roundss $0x9, %xmm2, %xmm0
00000000001125fa roundss $0xa, %xmm0, %xmm2
00000000001126e8 roundss $0xa, %xmm0, %xmm0
00000000001127a8 roundss $0x9, %xmm0, %xmm1
0000000000112873 roundss $0x9, %xmm2, %xmm2

analyzing libIASUnifiedProgress.dylib
00000000000011f0 roundss $0xa, %xmm0, %xmm2

analyzing libate.dylib
000000000000c5ef roundss $0xb, %xmm5, %xmm4
000000000000c5fd roundss $0xb, %xmm0, %xmm0
000000000000c60b roundss $0xb, %xmm3, %xmm3
000000000001f4f1 roundss $0x4, %xmm0, %xmm0
000000000002b01a roundss $0x4, %xmm15, %xmm0
0000000000044d82 vroundss $0xb, %xmm4, %xmm0, %xmm4
0000000000044d90 vroundss $0xb, %xmm0, %xmm0, %xmm0
0000000000044d9e vroundss $0xb, %xmm3, %xmm0, %xmm3
00000000000581c5 vroundss $0x4, %xmm1, %xmm0, %xmm1
0000000000060609 vroundss $0x4, %xmm0, %xmm0, %xmm0
0000000000070747 roundss $0xb, %xmm5, %xmm4
0000000000070755 roundss $0xb, %xmm0, %xmm0
0000000000070763 roundss $0xb, %xmm3, %xmm3
0000000000083649 roundss $0x4, %xmm0, %xmm0
000000000008ec04 roundss $0x4, %xmm15, %xmm0

analyzing libdtrace.dylib
000000000004086f roundss $0xa, %xmm0, %xmm0
0000000000040c49 roundss $0xa, %xmm0, %xmm0
000000000004106a roundss $0xa, %xmm0, %xmm0
000000000004210b roundss $0xa, %xmm0, %xmm0
0000000000043d1e roundss $0xa, %xmm0, %xmm0
00000000000447f8 roundss $0xa, %xmm0, %xmm0

analyzing libmecabra.dylib
000000000003120d roundss $0xa, %xmm0, %xmm0
0000000000031414 roundss $0xa, %xmm0, %xmm0
000000000003160e roundss $0xa, %xmm0, %xmm0
0000000000087f88 roundss $0xa, %xmm0, %xmm0
0000000000099d92 roundss $0xa, %xmm0, %xmm0
000000000009ac7a roundss $0xa, %xmm0, %xmm0
000000000009ad5e roundss $0xa, %xmm0, %xmm0
000000000009aec2 roundss $0xa, %xmm0, %xmm0
000000000009b5ea roundss $0xa, %xmm0, %xmm0
000000000009be4a roundss $0xa, %xmm0, %xmm0
00000000000a18de roundss $0xa, %xmm0, %xmm0
00000000000a1b10 roundss $0xa, %xmm0, %xmm0
00000000000f893a roundss $0xa, %xmm0, %xmm0
00000000000f8b32 roundss $0xa, %xmm0, %xmm0
00000000000f8d14 roundss $0xa, %xmm0, %xmm0
00000000000fa17e roundss $0xa, %xmm0, %xmm0
00000000000fa94e roundss $0xa, %xmm0, %xmm0
00000000000fd2f8 roundss $0xa, %xmm0, %xmm0
0000000000102031 roundss $0xa, %xmm0, %xmm0
000000000010228e roundss $0xa, %xmm0, %xmm0
0000000000112b6b roundss $0xa, %xmm0, %xmm0
0000000000112dde roundss $0xa, %xmm0, %xmm0
0000000000113509 roundss $0xa, %xmm0, %xmm0
000000000011387a roundss $0xa, %xmm0, %xmm0
000000000011f0c0 roundss $0xa, %xmm0, %xmm0
000000000012383c roundss $0xa, %xmm0, %xmm0
0000000000126140 roundss $0xa, %xmm0, %xmm0
0000000000126990 roundss $0xa, %xmm0, %xmm0
000000000012a26c roundss $0xa, %xmm0, %xmm0
000000000012bf64 roundss $0xa, %xmm0, %xmm0
000000000012d204 roundss $0xa, %xmm0, %xmm0
0000000000134a2d roundss $0xa, %xmm0, %xmm0
0000000000135d6f roundss $0xa, %xmm0, %xmm0
00000000001393fa roundss $0xa, %xmm0, %xmm0
000000000013958e roundss $0xa, %xmm0, %xmm0
0000000000141591 roundss $0xa, %xmm0, %xmm0
00000000001417d2 roundss $0xa, %xmm0, %xmm0
0000000000143362 roundss $0xa, %xmm0, %xmm0
0000000000143bcf roundss $0xa, %xmm0, %xmm0
0000000000143f48 roundss $0xa, %xmm0, %xmm0
000000000015b769 roundss $0xa, %xmm0, %xmm0
00000000001626a2 roundss $0xa, %xmm0, %xmm0
0000000000163b78 roundss $0xa, %xmm0, %xmm0
0000000000164e3e roundss $0xa, %xmm0, %xmm0
00000000001705ab roundss $0xa, %xmm0, %xmm0
0000000000172392 roundss $0xa, %xmm0, %xmm0
00000000001752e4 roundss $0xa, %xmm0, %xmm0
000000000017550e roundss $0xa, %xmm0, %xmm0
0000000000177343 roundss $0xa, %xmm0, %xmm0
000000000017753c roundss $0xa, %xmm0, %xmm0
0000000000177735 roundss $0xa, %xmm0, %xmm0
000000000017792e roundss $0xa, %xmm0, %xmm0
0000000000177b30 roundss $0xa, %xmm0, %xmm0
0000000000177d29 roundss $0xa, %xmm0, %xmm0
0000000000177f2b roundss $0xa, %xmm0, %xmm0
0000000000178120 roundss $0xa, %xmm0, %xmm0
000000000017856e roundss $0xa, %xmm0, %xmm0
00000000001869f4 roundss $0xa, %xmm0, %xmm0
000000000018e342 roundss $0xa, %xmm0, %xmm0
000000000018e434 roundss $0xa, %xmm0, %xmm0

analyzing libobjc.A.dylib
000000000001d8fe roundss $0xa, %xmm0, %xmm0
000000000001dbb8 roundss $0xa, %xmm0, %xmm0

analyzing libobjc.dylib
000000000001d8fe roundss $0xa, %xmm0, %xmm0
000000000001dbb8 roundss $0xa, %xmm0, %xmm0

analyzing libtailspin.dylib
0000000000007862 roundss $0xa, %xmm0, %xmm0
000000000000a69b roundss $0xa, %xmm0, %xmm0
000000000000b785 roundss $0xa, %xmm0, %xmm0
000000000000b906 roundss $0xa, %xmm0, %xmm0
000000000000c03a roundss $0xa, %xmm0, %xmm0
000000000000c18e roundss $0xa, %xmm0, %xmm0
000000000000c65f roundss $0xa, %xmm0, %xmm0
000000000000cad8 roundss $0xa, %xmm0, %xmm0
000000000000cd34 roundss $0xa, %xmm0, %xmm0
000000000000d0de roundss $0xa, %xmm0, %xmm0
000000000000d661 roundss $0xa, %xmm0, %xmm0
000000000000d7ae roundss $0xa, %xmm0, %xmm0
000000000000dc12 roundss $0xa, %xmm0, %xmm0
000000000000dd82 roundss $0xa, %xmm0, %xmm0
000000000000df08 roundss $0xa, %xmm0, %xmm0
000000000000e054 roundss $0xa, %xmm0, %xmm0
000000000001136a roundss $0xa, %xmm0, %xmm0
00000000000114b6 roundss $0xa, %xmm0, %xmm0
My guess is that it's libobjc.A.dylib, but I can't be sure.
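
To get an overview of a listing like this, the hits can be tallied per library. A minimal sketch, assuming the text format shown above ("analyzing <name>" followed by "address mnemonic operands" lines); the function name is hypothetical.

```python
from collections import Counter, defaultdict

def tally_hits(listing):
    """Count instruction mnemonics per library in an 'analyzing ...' listing."""
    counts = defaultdict(Counter)
    current = None
    for line in listing.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "analyzing" and len(parts) >= 2:
            current = parts[1]            # start of a new library section
        elif current is not None and len(parts) >= 2:
            counts[current][parts[1]] += 1  # parts[1] is the mnemonic
    return {lib: dict(c) for lib, c in counts.items()}
```

Sorting the result by total count makes it easier to see which libraries would need the most patching or emulation coverage.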
by: ImAndreWhI
Is there a beta kernel for the macOS 10.13 Developer Preview?