This time, I’ll be running the MegaComet tests as per test 3, with kernel logging enabled to see where I’m pushing the TCP stack too far, so that hopefully I can fix it with some configuration.
As per test 3: start 5 EC2 servers ‘ami-221fec4b’. Out of curiosity, I priced it this time. Since my tests will take less than an hour, it’ll cost $0.34 (EC2 large instance hourly cost) * 5 instances = $1.70 to run this test. I can handle that. It’s also probably worth mentioning that in EC2, I configure the firewall to leave all the MegaComet ports open. In the real world, you’d have the ‘application’ port restricted.
echo Configuring TCP stack
sudo bash
echo "# Settings from http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3" >> /etc/sysctl.conf
echo "# Config needed to have enough tcp stack memory:" >> /etc/sysctl.conf
echo "net.core.rmem_max = 33554432" >> /etc/sysctl.conf
echo "net.core.wmem_max = 33554432" >> /etc/sysctl.conf
echo "net.ipv4.tcp_rmem = 4096 16384 33554432" >> /etc/sysctl.conf
echo "net.ipv4.tcp_wmem = 4096 16384 33554432" >> /etc/sysctl.conf
echo "net.ipv4.tcp_mem = 786432 1048576 26777216" >> /etc/sysctl.conf
echo "net.ipv4.tcp_max_tw_buckets = 360000" >> /etc/sysctl.conf
echo "net.core.netdev_max_backlog = 2500" >> /etc/sysctl.conf
echo "vm.min_free_kbytes = 65536" >> /etc/sysctl.conf
echo "vm.swappiness = 0" >> /etc/sysctl.conf
echo "# This is for the outgoing connections max:" >> /etc/sysctl.conf
echo "net.ipv4.ip_local_port_range = 1024 65535" >> /etc/sysctl.conf
echo "# I added this to set the system wide file max:" >> /etc/sysctl.conf
echo "fs.file-max = 1100000" >> /etc/sysctl.conf
echo "# Reduce the time sockets stay in time_wait: http://forums.theplanet.com/lofiversion/index.php/t62399.html" >> /etc/sysctl.conf
echo "net.ipv4.tcp_fin_timeout = 12" >> /etc/sysctl.conf
exit
sudo sysctl -p

echo Enlarging user-limits on files
sudo bash
echo "* soft nofile 1048576" >> /etc/security/limits.conf
echo "* hard nofile 1048576" >> /etc/security/limits.conf
exit

echo Enabling kernel logging
sudo bash
echo "kern.* /var/log/kern.log" >> /etc/rsyslog.conf
sudo service rsyslog restart
exit

echo Installing build essentials
sudo yum -y install gcc* git* make

echo Installing libev
cd ~
wget http://dist.schmorp.de/libev/libev-4.04.tar.gz
tar -zxvf libev-4.04.tar.gz
cd libev-4.04
./configure && make && sudo make install

echo Adding libev to the library list
sudo sh -c "echo /usr/local/lib > /etc/ld.so.conf.d/usr-local-lib.conf"
sudo ldconfig

echo Installing MC
cd ~
git clone git://github.com/chrishulbert/MegaComet.git
cd MegaComet
make
cd testing
make

echo Now you have to logout and in again, because you only have a low per-user limit as you can see:
ulimit -n
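After logging back in, it can be worth confirming that the new limits actually took effect before starting a test run. The following is a minimal sketch, not part of MegaComet: `check_limit` is a hypothetical helper name, and the expected values are simply the ones written by the setup script above.

```shell
#!/bin/sh
# check_limit NAME ACTUAL EXPECTED (hypothetical helper, not part of MegaComet)
# Prints "NAME ok (ACTUAL)" when ACTUAL >= EXPECTED or ACTUAL is "unlimited",
# otherwise prints "NAME LOW (ACTUAL < EXPECTED)".
check_limit() {
    name=$1 actual=$2 expected=$3
    if [ "$actual" = "unlimited" ] || [ "$actual" -ge "$expected" ]; then
        echo "$name ok ($actual)"
    else
        echo "$name LOW ($actual < $expected)"
    fi
}

# Per-user open-file limit for the current shell (from /etc/security/limits.conf).
check_limit "nofile" "$(ulimit -n)" 1048576

# System-wide values written to /etc/sysctl.conf by the setup script.
if command -v sysctl >/dev/null 2>&1; then
    check_limit "fs.file-max" "$(sysctl -n fs.file-max)" 1100000
    check_limit "net.core.rmem_max" "$(sysctl -n net.core.rmem_max)" 33554432
fi
```

If any line reports LOW, the corresponding setting probably did not apply, for example because the session was not restarted after editing limits.conf.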
Once MC has started on the first instance, I run this to view the kernel logs:
sudo tail -f /var/log/kern.log
On the test instances (2-5):
cd ~/MegaComet/testing
./megatest A 10.40.29.57
I can only get up to 494k connections. On the server, here is the top output at the maximum. As you can see, there’s plenty of RAM free:
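One way to watch the connection count climb on the server, independently of MegaComet itself, is to read the kernel's socket summary. This is a sketch under the assumption that `/proc/net/sockstat` is available (it is on typical Linux kernels); `tcp_inuse` is a hypothetical helper name, not something from the MegaComet repo.

```shell
#!/bin/sh
# tcp_inuse (hypothetical helper): read /proc/net/sockstat-formatted text on
# stdin and print the number after "inuse" on the "TCP:" line.
tcp_inuse() {
    awk '$1 == "TCP:" { for (i = 2; i < NF; i += 2) if ($i == "inuse") print $(i + 1) }'
}

# Print the current count once, if the file is readable on this machine.
if [ -r /proc/net/sockstat ]; then
    tcp_inuse < /proc/net/sockstat
fi
```

Wrapping the last command in `watch -n 1` or a `while sleep 1` loop gives a running view while the test clients connect.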
top - 11:56:56 up 36 min,  4 users,  load average: 0.00, 0.14, 0.20
Tasks:  84 total,   1 running,  83 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   7652552k total,  2566036k used,  5086516k free,    20492k buffers
Swap:        0k total,        0k used,        0k free,   798960k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3373 ec2-user  20   0  8664  336  264 S  0.0  0.0   0:00.00 megamanager
 3374 ec2-user  20   0 27592  18m  460 S  0.0  0.2   0:15.90 megacomet
 3375 ec2-user  20   0 26640  17m  460 S  0.0  0.2   0:15.40 megacomet
 3376 ec2-user  20   0 26660  17m  460 S  0.0  0.2   0:15.29 megacomet
 3377 ec2-user  20   0 26680  18m  460 S  0.0  0.2   0:15.27 megacomet
 3378 ec2-user  20   0 27420  18m  460 S  0.0  0.2   0:16.59 megacomet
 3379 ec2-user  20   0 27304  18m  460 S  0.0  0.2   0:15.81 megacomet
 3380 ec2-user  20   0 26828  17m  460 S  0.0  0.2   0:15.52 megacomet
 3381 ec2-user  20   0 27188  18m  460 S  0.0  0.2   0:15.60 megacomet
And the slabtop output:
 Active / Total Objects (% used)    : 3713138 / 3713562 (100.0%)
 Active / Total Slabs (% used)      : 154792 / 154792 (100.0%)
 Active / Total Caches (% used)     : 58 / 74 (78.4%)
 Active / Total Size (% used)       : 1479546.63K / 1479691.42K (100.0%)
 Minimum / Average / Maximum Object : 0.01K / 0.40K / 8.00K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
524160 524158  99%    0.19K  24960       21     99840K dentry
496797 496642  99%    0.19K  23657       21     94628K kmalloc-192
496000 496000 100%    0.06K   7750       64     31000K kmalloc-64
495328 495328 100%    0.12K  15479       32     61916K kmalloc-128
495040 495040 100%    0.07K   8840       56     35360K blkdev_ioc
495000 495000 100%    0.62K  41250       12    330000K sock_inode_cache
494950 494950 100%    1.62K  26050       19    833600K TCP
163410 163385  99%    0.10K   4190       39     16760K buffer_head
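As a rough cross-check of the memory picture, the slab caches whose object counts track the connection count can be summed and divided by the number of TCP objects. This is a back-of-the-envelope sketch: treating those six caches as per-connection cost is an assumption based only on their object counts sitting at roughly 495k, and `per_conn_kb` is a hypothetical helper name.

```shell
#!/bin/sh
# per_conn_kb CONNECTIONS (hypothetical helper): read "cache_name size_in_K"
# lines on stdin and print the summed size divided by CONNECTIONS,
# in KB with two decimal places.
per_conn_kb() {
    awk -v conns="$1" '{ total += $2 } END { printf "%.2f\n", total / conns }'
}

# Cache sizes (K) copied from the slabtop output above;
# 494950 is the object count of the TCP cache.
per_conn_kb 494950 <<'EOF'
kmalloc-192 94628
kmalloc-64 31000
kmalloc-128 61916
blkdev_ioc 35360
sock_inode_cache 330000
TCP 833600
EOF
```

This prints 2.80, so about 2.8 KB of slab memory per connection. If that scaled linearly, a million connections would want somewhere around 2.8 GB of slab, which still fits in the 7.6 GB this instance reports.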
Nothing appeared in the kernel log. So, for now, I’m not sure what the holdup is: no kernel errors, and I didn’t hit a memory ceiling. I’m puzzled.
Thanks for reading! And if you want to get in touch, I'd love to hear from you: chris.hulbert at gmail.