[(app name).rhcloud.com (username)]\> ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
1313 240483 0.0 0.0 105068 3152 ? S 17:02 0:00 sshd: (user)@pts/1
1313 240486 0.0 0.0 108608 2100 pts/1 Ss 17:02 0:00 /bin/bash --init-file /usr/bin/rhcsh -i
1313 249661 0.1 0.4 397100 35224 ? Sl 17:08 0:00 /usr/bin/mongod --auth -f /var/lib/openshift/(user)/mongodb//conf/mongodb.conf run
1313 261473 5.5 0.0 0 0 ? R 17:15 0:00 [node]
1313 261476 2.0 0.0 110244 1156 pts/1 R+ 17:15 0:00 ps aux
1313 390906 8.1 0.2 1021240 20196 ? Sl Dec10 321:14 node /opt/rh/nodejs010/root/usr/bin/supervisor -e node|js|coffee -p 1000 -- server.js
[(app name).rhcloud.com (username)]\> kill 390906
That killed "supervisor", the process that re-spawns the node process. This is generally helpful, but today it was re-spawning node under a new PID over and over, and it seemed like that was happening faster than the gear could stop it. Unfortunately, I then couldn't restart supervisor (rerunning its command from the ps output just gave me an error complaining about an Unhandled 'error' event in the supervisor script), so I decided to start the node service myself.
There are a few ways of doing this. You can go to your code and run node directly, or you can use gear start. But gear start won't start anything if it thinks the app is already running. After I killed supervisor, the node process was no longer being restarted, but gear start didn't work either. I tried tricking it by emptying the $OPENSHIFT_NODEJS_PID_DIR/cartridge.pid file, but that didn't work either... It did point out something I could use, though.
[(appname).rhcloud.com (username)]\> gear stop
Stopping gear...
Stopping NodeJS cartridge
usage: kill [ -s signal | -p ] [ -a ] pid ...
       kill -l [ signal ]
Stopping MongoDB cartridge
[(appname).rhcloud.com (username)]\> gear start
Starting gear...
Starting MongoDB cartridge
Starting NodeJS cartridge
Application 'deploy' failed to start
An error occurred executing 'gear start' (exit code: 1)
Error message: Failed to execute: 'control start' for /var/lib/openshift/(username)/nodejs
For more details about the problem, try running the command again with the '--trace' option.
What I found interesting was that gear stop apparently passed the empty contents of the $OPENSHIFT_NODEJS_PID_DIR/cartridge.pid file along to kill, and kill didn't know what to do with that. In fact, kill exits with a non-zero status if you don't tell it what to kill OR if you tell it to kill a PID that isn't there (the original issue), so instead of getting an 'okay' back from kill, the gear script got a failure, and that meant problems for gear. So I thought that if I got something running on a PID that it COULD kill and put that PID in the file, it would kill it successfully and everything would be back to normal. The easiest thing I could think of was to add the '}' I'd forgotten in my script and run that.
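The kill behaviour described above is easy to check in any shell. This is only a minimal sketch: kill_result is a helper name made up for the illustration, and sleep stands in for the node process. It says nothing about how the gear scripts call kill internally, only how kill itself responds in the three cases.

```shell
#!/usr/bin/env bash
# Sketch: how kill responds to a live PID, a dead PID, and no PID at all.
# kill_result is a hypothetical helper, not part of the gear scripts.

kill_result() {
    # Run kill with the given arguments, hide its messages,
    # and report only whether it succeeded.
    if kill "$@" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "failed"
    fi
}

# Start a throwaway process to aim at (stand-in for node).
sleep 30 &
live_pid=$!

# Case 1: the PID exists, so kill succeeds.
status_live=$(kill_result "$live_pid")

# Reap the process so the PID is really gone before the next case.
wait "$live_pid" 2>/dev/null || true

# Case 2: the PID no longer exists, so kill fails.
status_gone=$(kill_result "$live_pid")

# Case 3: an empty PID. The variable is deliberately left unquoted,
# so kill receives no arguments at all and prints its usage text.
empty_pid=""
status_empty=$(kill_result $empty_pid)

echo "live pid:  $status_live"
echo "gone pid:  $status_gone"
echo "empty pid: $status_empty"
```

Only the first case reports ok, which matches what happened here: both the stale PID and the emptied file left gear with a failing kill.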
The node code is stored in /app-deployments/<datestamp>/repo/, but don't expect changes you make there to stick around.
\> node server.js
^Z
\> ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
1313 240483 0.0 0.0 105068 3152 ? S 17:02 0:00 sshd: (user)@pts/1
1313 240486 0.0 0.0 108608 2124 pts/1 Ss 17:02 0:00 /bin/bash --init-file /usr/bin/rhcsh -i
1313 275483 0.3 0.4 467788 36892 ? Sl 17:24 0:01 /usr/bin/mongod --auth -f /var/lib/openshift/(user)/mongodb//conf/mongodb.conf run
1313 284292 2.5 0.6 732440 45924 pts/1 Sl 17:30 0:02 node server.js
1313 287036 2.0 0.0 110240 1156 pts/1 R+ 17:32 0:00 ps aux
\> echo "284292" > $OPENSHIFT_NODEJS_PID_DIR/cartridge.pid
So, the PID is in the file, and it belongs to a valid, running node process. Then I committed my fix with git and ran git push... and it was back to normal!
Counting objects: 5, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 344 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Stopping NodeJS cartridge
remote: Stopping MongoDB cartridge
remote: Saving away previously installed Node modules
remote: Building git ref 'master', commit f5e40ef
remote: Building NodeJS cartridge
remote: npm info it worked if it ends with ok
...
remote: npm info ok
remote: Preparing build for deployment
remote: Deployment id is aa38fed5
remote: Activating deployment
remote: Starting MongoDB cartridge
remote: Starting NodeJS cartridge
remote: Result: success
remote: Activation status: success
remote: Deployment completed with status: success
So, now that the PID was stable and correct, it seemed to deploy properly and I've had no troubles since!
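If you ever need to do the same thing, the "start it, look up the PID in ps, echo it into the file" steps can be squeezed into one. The sketch below is an assumption-laden illustration, not the gear's own tooling: record_pid is a made-up helper, sleep stands in for node server.js, and a temporary directory stands in for $OPENSHIFT_NODEJS_PID_DIR so it can be run anywhere without touching a real cartridge.pid.

```shell
#!/usr/bin/env bash
# Sketch: start a process in the background and record its PID in a file,
# but only after confirming that the PID is actually alive.

# record_pid PID FILE  (hypothetical helper)
record_pid() {
    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    if kill -0 "$1" 2>/dev/null; then
        echo "$1" > "$2"
    else
        echo "PID $1 is not running; not writing $2" >&2
        return 1
    fi
}

# Stand-in for $OPENSHIFT_NODEJS_PID_DIR on the gear.
pid_dir=$(mktemp -d)
pid_file="$pid_dir/cartridge.pid"

# Stand-in for: node server.js &
sleep 30 &
server_pid=$!          # $! is the PID of the most recent background job

record_pid "$server_pid" "$pid_file"
recorded=$(cat "$pid_file")
echo "recorded PID: $recorded"

# Clean up the stand-in process and the temporary directory.
kill "$server_pid"
wait "$server_pid" 2>/dev/null || true
rm -rf "$pid_dir"
```

Starting the process with & and reading $! avoids copying the PID out of ps by hand, and the kill -0 check keeps a dead PID from landing in the file, which is the situation that caused the trouble in the first place.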