OS X support #51

haileys · 2015-09-17T05:27:31Z

This pull request adds OS X support to Semian.

The biggest potentially controversial change here is that I've removed the ticket acquisition timeout entirely. This is because OS X doesn't have semtimedop. I asked @sirupsen about this and he backgrounded me on the original motivation for this timeout and why it isn't strictly necessary anymore.

Because we no longer block on semaphore operations, I was also able to remove the GVL release code.

cc @sirupsen @csfrancis @byroot

byroot · 2015-09-17T05:30:25Z

this timeout and why it isn't strictly necessary anymore.

Yes. We recently removed all usage of it. So it's fine. We'll have to do a major version bump though.

Flagging this for later review even though @csfrancis is the authoritative person here.

Also you have a CI failure that seems legit.

haileys · 2015-09-17T05:31:11Z

The other thing I discussed with @sirupsen is the use of SHA1 to generate Semaphore keys from strings. There's a slight potential for this to generate the same key for multiple resources, or even between totally different applications that both happen to use SysV semaphores.

Is there any reason why Semian uses SHA1 for this and not ftok? ftok generates a key from a file path and guarantees uniqueness this way, rather than SHA1's probabilistic uniqueness.
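For readers unfamiliar with the approach being discussed, deriving a fixed-width key from a resource name by truncating a digest can be sketched roughly as below. This is an illustration only, not Semian's actual implementation: `key_for` is a hypothetical helper, and the 32-bit width assumes a 32-bit `key_t`.

```ruby
require 'digest'

# Hypothetical helper: map a resource name to a 32-bit integer key by
# reading the first 4 bytes of its SHA1 digest as a big-endian integer.
# Assumes a 32-bit key_t.
def key_for(name)
  Digest::SHA1.digest(name.to_s).unpack1('N')
end

mysql_key = key_for(:mysql_shard_0)
redis_key = key_for(:redis_cache)

puts format('mysql_shard_0 -> 0x%08x', mysql_key)
puts format('redis_cache   -> 0x%08x', redis_key)
```

The mapping is deterministic, so every process that uses the same name gets the same key, but truncating to 32 bits is what makes uniqueness probabilistic rather than guaranteed.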

haileys · 2015-09-17T05:41:45Z

@byroot Pushed up f7372fb which fixes that failure. Turns out that test was implicitly relying on timeouts (as it tried to acquire the resource immediately after sending SIGKILL to the child process that had already acquired the resource). I've added a call to Process.waitpid after Process.kill so we only try to acquire the resource after the child process has died.
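The ordering described above can be sketched in isolation. This is not the Semian test itself: `kill_and_reap` is a hypothetical helper, and the sleeping child stands in for a process that has acquired the resource.

```ruby
# Send SIGKILL to a child process, then block until it has actually
# exited. Returns the Process::Status of the reaped child.
def kill_and_reap(pid)
  Process.kill('KILL', pid)
  _, status = Process.waitpid2(pid)
  status
end

child = fork { sleep } # stand-in for a child holding the resource
status = kill_and_reap(child)
puts "child terminated by signal #{status.termsig}"
```

`Process.kill` only delivers the signal; without the wait, the parent can race ahead while the child is still alive, which is the race the timeout used to paper over.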

sirupsen · 2015-09-17T13:59:27Z

The argument I have for not requiring a ticket count, which I tried to explain to Charlie last night, is that the equation for a ticket count's effectiveness looks like this: p^t = q, where p is the probability of sampling a worker talking to the resource that has ticket count t, and q is the accepted false positive rate for sampling t workers talking to the resource. Because of this exponential relationship, once t grows beyond 6-8 a timeout has very questionable value, and it makes it harder to reason about an effective ticket count. A t of, say, 8 is as appropriate for 15 workers as it is for 80 workers (this means that it's better, for Semian, to have more workers on the same box, as you reduce capacity less during a failure in that case).

The reason we introduced the timeout was that we ran SysV namespaces with only 4 workers, which meant that we couldn't go beyond a ticket count of 2 and thus needed another mechanism to tune. However, we no longer do this and I don't anticipate us doing this again, which is why I think it's absolutely fine to remove semtimedop in favour of never having a timeout.
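To make the p^t = q relationship concrete, here is a small worked sketch. The value of p is made up for illustration, the helper names are hypothetical, and the model assumes workers are sampled independently.

```ruby
# q = p**t: chance that t independently sampled workers are all talking
# to the resource at the same moment.
def false_positive_rate(p, t)
  p**t
end

# Smallest ticket count t for which p**t <= q.
def tickets_for(p, q)
  (Math.log(q) / Math.log(p)).ceil
end

p_busy = 0.5 # assumed value, for illustration only
(1..8).each do |t|
  puts format('t=%d  q=%.4f', t, false_positive_rate(p_busy, t))
end
puts "tickets needed for q <= 1%: #{tickets_for(p_busy, 0.01)}"
```

The total worker count appears in neither function, which matches the point that the same t suits 15 workers as well as 80.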

sirupsen · 2015-09-17T17:17:40Z

I've put it on my list to test this PR in production. I'll attempt to do that next week. Nothing really jumps out, thanks @charliesome :)

csfrancis · 2015-09-17T19:35:04Z

ext/semian/extconf.rb

@@ -23,7 +23,7 @@
 have_func 'rb_thread_blocking_region'
 have_func 'rb_thread_call_without_gvl'

-$CFLAGS = "-D_GNU_SOURCE -Werror -Wall "
+$CFLAGS = "-D_GNU_SOURCE -Wall "


Why did you have to get rid of -Werror?

SHA1 is deprecated in the OS X system OpenSSL headers and causes a warning.

Let's switch to something else then, rather than disabling the warning.

@fw42 Charlie already suggested following up with ftok

Switching to ftok will likely require changes in the external Semian API. This might not be a big deal since this branch will already require a major version bump, but using ftok properly will not be as simple as just replacing the SHA1 call.

We could pass -Wno-deprecated-declarations to disable this specific warning, but I'm not sure how that will affect compilers that don't support that option.

byroot · 2015-09-18T03:28:37Z

I tested this branch with shopify in dev and test. Seems all green. We can try it out on production servers once the last concerns are settled.

haileys · 2015-09-18T04:34:07Z

Since this pull request will already require a major version bump, I'd like to discuss another breaking change that we could make while we're at it.

IMO we should use ftok rather than SHA1 to generate semaphore keys. ftok returns a key which is generated from a file's identity on disk rather than just being a simple digest. This guarantees that there won't be clashes between two semaphores. While clashes are fairly unlikely with SHA1 (1 in 4 billion with a 32-bit key_t), a guarantee that a clash won't happen is even better 😁

Using ftok requires that we have a file on disk per resource that is visible to all workers. We could just use /tmp/semian.#{resource_name}.sem or something, but this feels a little sketchy and makes more assumptions about the environments that Semian will be used in than I'm comfortable making: it assumes that /tmp exists and is writable (in this brave new world of containerising everything, this may not be a reasonable assumption). It also assumes that all processes sharing an IPC namespace also share the same root directories and mount namespaces, which is again not necessarily a reasonable assumption these days.

I think the ideal solution here is to add a parameter to Semian::Resource (and expose that up through Semian.register) that allows the user to customise the filename used for ftok. That could be something in /tmp by default, or it could be a required parameter.

byroot · 2015-09-18T05:54:15Z

Using ftok requires that we have a file on disk per resource that is visible to all workers.

This (unless I'm missing something) is, I'm afraid, impossible for us since we're running our processes in multiple Docker containers. We do share the IPC namespace between containers, but not the filesystem.

sirupsen · 2015-09-22T00:32:20Z

Yeah, @byroot has a valid point for our deployment. I really don't want to do a shared mount between all the containers just for this :/

csfrancis · 2015-09-22T00:33:24Z

While clashes are fairly unlikely with SHA1, (1 in 4 billion with a 32 bit key_t), a guarantee that a clash won't happen is even better

ftok isn't bulletproof either, unfortunately (from the BSD docs on OSX):

BUGS
The returned key is computed based on the device minor number and inode of the specified path, in combination with the lower 8 bits of the given id. Thus, it is quite possible for the routine to return duplicate keys.
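A rough sketch of why such a scheme can return duplicates follows. The bit layout used here (8 bits of id, 8 bits of device number, 16 bits of inode) is an assumption for illustration; real implementations may differ, and `ftok_like_key` is a hypothetical name.

```ruby
# Pack a key from the low 8 bits of id, the low 8 bits of the device
# number and the low 16 bits of the inode. The layout is assumed.
def ftok_like_key(dev, ino, id)
  ((id & 0xff) << 24) | ((dev & 0xff) << 16) | (ino & 0xffff)
end

a = ftok_like_key(1, 0x0001_2345, 1)
b = ftok_like_key(1, 0x0002_2345, 1) # different inode, same low 16 bits
puts format('a=0x%08x b=0x%08x collide=%s', a, b, a == b)
```

For a real file, the inputs would come from something like `File.stat(path).dev` and `File.stat(path).ino`; any bits that do not fit into the key are simply discarded, which is where the duplicates come from.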

csfrancis · 2015-09-22T00:38:36Z

Perhaps we could do something like store a mapping of all active resource_id -> sem_ids in a hash to detect collisions? It seems unlikely that we would encounter one, but if we did, the user would definitely want to know about it.
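One possible shape for such a check is sketched below. `KeyRegistry` is hypothetical and not part of Semian, and it only sees names registered within a single process; detecting collisions across processes would need shared state.

```ruby
require 'digest'

# Hypothetical registry: remembers which resource name produced each key
# and raises if a different name maps to a key that is already taken.
class KeyRegistry
  class CollisionError < StandardError; end

  def initialize(&key_fn)
    @key_fn = key_fn
    @names_by_key = {}
  end

  def register(name)
    key = @key_fn.call(name.to_s)
    owner = (@names_by_key[key] ||= name.to_s)
    unless owner == name.to_s
      raise CollisionError, "#{name} and #{owner} both map to key #{key}"
    end
    key
  end
end

registry = KeyRegistry.new { |name| Digest::SHA1.digest(name).unpack1('N') }
registry.register(:mysql_shard_0)
registry.register(:mysql_shard_0) # registering the same name again is fine
```

Failing loudly at registration time means a collision shows up as a boot error rather than as two unrelated resources silently sharing tickets.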

byroot · 2015-09-22T00:40:28Z

I think you are going too far, guys. We're talking about hashing human identifiers. Even MD5 would do. The risk of accidental collisions is extremely low.
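To put a rough number on "extremely low", the standard birthday-bound approximation can be evaluated for a handful of resource counts. This assumes keys are uniformly distributed and that `key_t` is 32 bits wide; `collision_probability` is a hypothetical helper.

```ruby
# Birthday-bound approximation: probability that at least two of n
# uniformly random keys collide in a key space of 2**bits values.
def collision_probability(n, bits: 32)
  space = 2.0**bits
  1.0 - Math.exp(-n * (n - 1) / (2.0 * space))
end

[10, 100, 1_000].each do |n|
  puts format('%5d resources: %.2e', n, collision_probability(n))
end
```

This only covers collisions among one application's own resources; it says nothing about keys chosen by unrelated software on the same host.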

haileys · 2015-09-22T03:03:12Z

ftok isn't bulletproof either, unfortunately

Ugh, I missed that. That's a shame. I guess SHA1 is good enough in that case then

sirupsen · 2015-11-11T16:03:33Z

I'd question again why we even need this. Why can't we just disable this in test and dev? @charliesome

byroot · 2015-11-11T16:36:06Z

@sirupsen because then you can't run your integration tests on OSX. :sad_panda:

sirupsen · 2015-11-11T16:38:47Z

Yeah, but those don't fork etc., so you should be able to use the in-memory adapter, no?

Unless you mean the Semian test suite.

byroot · 2015-11-11T16:40:52Z

Oh. Well, if we end up having an in-memory semaphore, then yes.
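As a sketch of what an in-memory semaphore might look like, here is a minimal single-process ticket pool. It is hypothetical and not an existing Semian adapter; since its state is not shared across processes, it would only suit non-forking dev and test setups like the ones described above.

```ruby
# Hypothetical in-process ticket pool. Acquisition never blocks: if no
# ticket is free it raises immediately, similar in spirit to IPC_NOWAIT.
class InMemoryTickets
  class NoTicketsError < StandardError; end

  def initialize(tickets)
    @available = tickets
    @lock = Mutex.new
  end

  # Take a ticket, run the block, and always give the ticket back.
  def acquire
    @lock.synchronize do
      raise NoTicketsError, 'no tickets available' if @available.zero?
      @available -= 1
    end
    begin
      yield
    ensure
      @lock.synchronize { @available += 1 }
    end
  end
end

pool = InMemoryTickets.new(1)
pool.acquire { puts 'holding the only ticket' }
```

The lock is held only while the counter changes, not while the block runs, so concurrent threads fail fast instead of queueing behind a slow caller.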

Soleone · 2016-05-24T19:36:34Z

any updates on this by chance? this came up to test certain semian details in development mode in osx

bf4 · 2017-05-28T02:21:23Z

@sirupsen @charliesome This looks like an interesting PR but is about a year and a half old... any reason it's still open? looks like it's targeted for only Linux in prod, so it could be closed with 'make sure your dev environment is linux; we don't expect support for BSD/OSX'?

sirupsen · 2017-05-28T12:07:26Z

@bf4 we're still interested in this work upstream, someone just needs to own it :)

Charlie Somerville added 5 commits September 9, 2015 18:02

darwin is supported too

b12210f

remove union semun declaration

38592f4

this is already declared by OS X system headers and causes compilation errors

SEM_INFO is not available on OS X

21b4880

remove timeout param from perform_semop and always pass IPC_NOWAIT

d657f75

update perform_semop callsites to not pass a timeout arg

8eef793

haileys force-pushed the osx branch from ff63793 to f7372fb on September 17, 2015 05:38

csfrancis reviewed Sep 17, 2015

Charlie Somerville added 7 commits September 18, 2015 14:48

rip ticket timeout stuff out from c extension

f1ee043

change wording on Semian::TimeoutError exception message

bdc21aa

no need to release GVL, semops no longer block

a01a524

rename init_max_semaphore_count to get_max_semaphore_count

139dfc5

remove timeout stuff from ruby code and docs

fd286d6

remove -Werror from CFLAGS

390cca9

SHA1 is deprecated on OS X and causes a build warning

wait for child proces to die before re-acquiring resource

be4e881

haileys force-pushed the osx branch from f7372fb to be4e881 on September 18, 2015 04:49

sirupsen mentioned this pull request Sep 29, 2015

EDIT: This branch is the iteration target. Implementing a per-host circuit breaker state using shared memory and semaphores #54

Open

sirupsen mentioned this pull request Oct 24, 2018

Net::ResourceBusyError Permission denied #201

Closed

miry mentioned this pull request Jun 10, 2022

Semian sysv semaphores are not supported on x86_64-darwin21 #313

Closed

miry force-pushed the master branch from 98d8601 to 57d2e0d on February 8, 2023 10:19
