[wangch@im 20210908183557]$ sudo php start_new.php status
Workerman[start_new.php] status
----------------------------------------------GLOBAL STATUS----------------------------------------------------
Workerman version:4.0.19 PHP version:7.3.6
start time:2022-02-17 18:11:55 run 0 days 16 hours
load average: 21.5, 17, 16 event-loop:\Workerman\Events\Event
1 workers 1 processes
worker_name exit_status exit_count
PHPSocketIO 0 0
----------------------------------------------PROCESS STATUS---------------------------------------------------
pid memory listening worker_name connections send_fail timers total_request qps status
42850 342M socketIO://0.0.0.0:443 PHPSocketIO 3037 228 1207 2697377 0 [idle]
----------------------------------------------PROCESS STATUS---------------------------------------------------
Summary 342M - - 3037 228 1207 2697377 0 [Summary]
[wangch@im 20210908183557]$ sudo php start_208.php status
Workerman[start_208.php] status
----------------------------------------------GLOBAL STATUS----------------------------------------------------
Workerman version:4.0.19 PHP version:7.3.6
start time:2022-02-17 18:04:50 run 0 days 16 hours
load average: 19.49, 16, 16 event-loop:\Workerman\Events\Event
1 workers 1 processes
worker_name exit_status exit_count
PHPSocketIO 0 0
----------------------------------------------PROCESS STATUS---------------------------------------------------
pid memory listening worker_name connections send_fail timers total_request qps status
42584 122M socketIO://0.0.0.0:8002 PHPSocketIO 3527 2 2702 838519 0 [idle]
----------------------------------------------PROCESS STATUS---------------------------------------------------
Summary 122M - - 3527 2 2702 838519 0 [Summary]
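Previously a single phpsocket.io instance could not handle peak traffic. I have now added a second instance using inter-process communication, which has greatly reduced the pressure on the single instance.
There are now two instances, but with the same number of connections and the same business logic, instance one uses noticeably more memory, and instance one occasionally shows connection blocking.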
[wangch@im ~]$ sudo netstat -anp | grep ESTABLISHED | grep -i "443" | wc -l
3005
[wangch@im ~]$ sudo netstat -anp | grep ESTABLISHED | grep -i "8002" | wc -l
3573
[wangch@im ~]$ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
TIME_WAIT 27
CLOSE_WAIT 56
FIN_WAIT1 12
ESTABLISHED 6661
SYN_RECV 29
LAST_ACK 350
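Watching from the shell, I found that as soon as the CLOSE_WAIT count goes above single digits, instance one stalls. When the CLOSE_WAIT count is 0 or 1, connections are made normally with no blocking.
I would like to get rid of the occasional blocking on instance one.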
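For anyone repeating this observation: `grep -i "443"` matches any line containing 443, including remote ports and pids, so the counts can be slightly off. Below is a minimal sketch that counts only sockets whose local port matches. The function name `count_state_on_port` is made up for illustration, and it assumes the usual `netstat -ant` column order (local address in column 4, state in column 6).

```shell
# count_state_on_port STATE PORT
# Reads `netstat -ant` style output on stdin and counts sockets in STATE
# whose LOCAL address (column 4) ends in :PORT.
count_state_on_port() {
    awk -v state="$1" -v port="$2" '
        /^tcp/ {
            n = split($4, parts, ":")   # local address, e.g. 0.0.0.0:443
            if (parts[n] == port && $6 == state) count++
        }
        END { print count + 0 }         # print 0 when nothing matched
    '
}
```

Usage would be something like `sudo netstat -ant | count_state_on_port CLOSE_WAIT 443`, run in a loop or under `watch` while the stall is happening.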
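What is the server configuration?
Also, the system load is too high; it is already around 20. High system load can make the service stall.
Instance one has handled more than three times as many requests as instance two. Its higher memory usage may be related to certain requests holding on to resources, but that is hard to confirm. High memory is not necessarily the cause of the stalls, though, unless the system runs out of memory and starts using swap.
If your server has 4 or more cores, you can run more instances, for example 4 or more, as long as you do not exceed the number of CPU cores.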
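To pick an instance count, it helps to know how many cores the machine reports. A minimal sketch, assuming the standard Linux `/proc/cpuinfo` layout (`processor` and `physical id` lines); the helper name `cpu_summary` is hypothetical:

```shell
# cpu_summary: reads /proc/cpuinfo-format text on stdin and prints
# "<logical processors> <physical sockets>".
cpu_summary() {
    awk -F: '
        /^processor/   { logical++ }
        /^physical id/ {
            gsub(/[ \t]/, "", $2)                  # strip padding around the id
            if (!($2 in seen)) { seen[$2] = 1; sockets++ }
        }
        END { print logical + 0, sockets + 0 }
    '
}
```

Run it as `cpu_summary < /proc/cpuinfo`. The first number is the upper bound suggested above for the number of instances.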
Server configuration:
[wangch@im ~]$ sudo ethtool eth0
Settings for eth0:
Supported ports: [ ]
Supported link modes: Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Advertised link modes: Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
MDI-X: Unknown
Link detected: yes
[wangch@im ~]$ cat /proc/cpuinfo | grep "physical id" | uniq | wc -l
40
[wangch@im ~]$ cat /proc/cpuinfo | grep "cpu cores" | uniq
cpu cores : 10
[wangch@im ~]$ cat /proc/meminfo | grep MemTotal
MemTotal: 66058388 kB
If this server runs nothing but the phpsocket.io project, then the load does come from it. The top command will give you a rough idea of which process is causing the high load.
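As a non-interactive alternative to top, here is a rough sketch that ranks processes by CPU from `ps` output. The helper name `top_cpu` is made up for illustration; it assumes input rows of the form "PID %CPU COMMAND" with one header line.

```shell
# top_cpu N: reads "PID %CPU COMMAND" rows (with a header line) on stdin
# and prints the N rows with the highest %CPU (default 5).
top_cpu() {
    tail -n +2 | sort -k2,2nr | head -n "${1:-5}"
}
```

For example: `ps -eo pid,pcpu,comm | top_cpu 5`. If the php processes are not near the top, the load is coming from something else on the box.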
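Boss, there is something I cannot figure out. Today I split the traffic between the two services. One service now reaches about 6k connections without blocking, but the other blocks at only 700 to 800 connections, not even 1k. Could you help me roughly locate the cause?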
[wangch@im 20210908183557]$ sudo php start_new.php status
Workerman[start_new.php] status
----------------------------------------------GLOBAL STATUS----------------------------------------------------
Workerman version:4.0.19 PHP version:7.3.6
start time:2022-02-18 18:22:15 run 2 days 21 hours
load average: 4.24, 6, 6 event-loop:\Workerman\Events\Event
1 workers 1 processes
worker_name exit_status exit_count
PHPSocketIO 0 0
----------------------------------------------PROCESS STATUS---------------------------------------------------
pid memory listening worker_name connections send_fail timers total_request qps status
14412 388.5M socketIO://0.0.0.0:443 PHPSocketIO 463 753 162 3592875 0 [idle]
----------------------------------------------PROCESS STATUS---------------------------------------------------
Summary 1940M - - 2359 3765 466 17963935 0 [Summary]
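This is the status of the service that blocks.
About the traffic split:
The app currently has few users, so app chat gets its own service. The PC web client has more users, so it gets its own service as well. In other words, PC and app clients connect to different services.
Blocking symptoms:
1. Each service has its own connection domain. When a client tries to connect, the connection does not succeed right away; it waits more than ten seconds before it goes through.
2. The same thing is visible when opening the connection domain directly in a browser. If the connection succeeds, this appears instantly:
{
"code": 0,
"message": "Transport unknown"
}
Otherwise the page just keeps loading.
3. The CLOSE_WAIT count for this service (port 443) shows it as well: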
[wangch@im 20210908183557]$ sudo netstat -anp | grep CLOSE_WAIT | grep -i "443" | wc -l
68
When connections are failing, run php start.php status
and check whether any process is in the busy state. If there is one, use strace to see where it is blocked.
http://wtbis.cn/doc/workerman/debug/busy-process.html
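A small sketch of that check, assuming the status output keeps the layout shown earlier in this thread (pid in the first column, `[idle]` or `[busy]` in the last). The helper name `busy_pids` is hypothetical:

```shell
# busy_pids: reads Workerman `status` output on stdin and prints the pid
# (first column) of every process row whose last column is [busy].
busy_pids() {
    awk '$1 ~ /^[0-9]+$/ && $NF == "[busy]" { print $1 }'
}
```

Something like `sudo php start_new.php status | busy_pids` would list the candidates, and `sudo strace -tt -p <pid>` can then be attached to each one.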
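OK boss, I will look into it.
Also, to run two services I only added another start file in the same directory. I do not need to set up a separate copy (a complete composer directory) for the second one, do I?
Is this the right way to do it? I am worried that if even this step is wrong, checking anything else is pointless.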
16:48:18.879367 poll([{fd=69, events=POLLIN|POLLOUT|POLLERR|POLLHUP}], 1, 15000) = 1 ([{fd=69, revents=POLLOUT}])
16:48:18.880647 getsockopt(69, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
16:48:18.880696 fcntl(69, F_SETFL, O_RDWR) = 0
16:48:18.880747 sendto(69, "POST /internaldov2.php HTTP/1.0\r"..., 330, MSG_DONTWAIT, NULL, 0) = 330
16:48:18.880818 poll([{fd=69, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
16:48:18.880869 poll([{fd=69, events=POLLIN|POLLERR|POLLHUP}], 1, 15000) = 1 ([{fd=69, revents=POLLIN}])
16:48:18.901827 recvfrom(69, "HTTP/1.1 200 OK\r\nServer: nginx\r\n"..., 8192, MSG_DONTWAIT, NULL, NULL) = 2142
16:48:18.901893 poll([{fd=69, events=POLLIN|POLLERR|POLLHUP}], 1, 15000) = 1 ([{fd=69, revents=POLLIN}])
16:48:18.901948 recvfrom(69, "", 8192, MSG_DONTWAIT, NULL, NULL) = 0
16:48:18.901992 poll([{fd=69, events=POLLIN|POLLERR|POLLHUP}], 1, 15000) = 1 ([{fd=69, revents=POLLIN}])
16:48:18.902051 recvfrom(69, "", 8192, MSG_DONTWAIT, NULL, NULL) = 0
16:48:18.902106 close(69) = 0
16:48:18.902326 socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 69
16:48:18.902390 fcntl(69, F_GETFL) = 0x2 (flags O_RDWR)
16:48:18.902433 fcntl(69, F_SETFL, O_RDWR|O_NONBLOCK) = 0
16:48:18.902464 connect(69, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("10.10.10.150")}, 16) = -1 EINPROGRESS (Operation now in progress)
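strace shows these calls repeating in a loop.
10.10.10.150 is the database IP, but all the I/O happens inside custom phpsocketio methods. Does blocking I/O there also block plain connections?
A blocked database call does affect connections. If 10.10.10.150 is the database IP, the strace output suggests the connection is going to the wrong port: it connected to port 80.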
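To see how long each blocking call holds up the process, the timestamps in the strace output can be diffed. In the trace above, the poll issued at 16:48:18.880869 is followed by recvfrom at 16:48:18.901827, so the process sat waiting on that HTTP response for about 21 ms. Below is a rough sketch that flags such gaps automatically. The helper name `slow_syscalls` is made up; it assumes `strace -tt` output with one call per line, no pid prefix, and no midnight rollover.

```shell
# slow_syscalls MIN_US: reads `strace -tt` lines on stdin and, whenever the
# next line starts at least MIN_US microseconds later (default 10000),
# prints "<gap in microseconds> <the earlier line>".
slow_syscalls() {
    awk -v min_us="${1:-10000}" '
        {
            split($1, t, /[:.]/)        # t[1]=HH t[2]=MM t[3]=SS t[4]=usec
            now = ((t[1] * 60 + t[2]) * 60 + t[3]) * 1000000 + t[4]
            if (NR > 1 && now - prev >= min_us)
                printf "%d %s\n", now - prev, prevline
            prev = now
            prevline = $0
        }
    '
}
```

For example, save `sudo strace -tt -p <pid>` output to a file and run `slow_syscalls 10000 < trace.txt`. Since each instance is a single process, every gap reported here is time during which that instance could not accept or serve other connections.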