Network Troubleshooting: Methodology, Tools, Packet Capture, SNMP Traps, and Root Cause Analysis
Network troubleshooting is one of the most important skills for a network engineer. A methodology provides a systematic approach to solving problems; tools such as ping, traceroute, and nslookup help pinpoint the fault; packet capture with Wireshark or tcpdump shows the actual traffic; SNMP traps raise alerts on events automatically; and root cause analysis finds the underlying cause so the fix is permanent.
Network issues are usually reported as "the internet is slow" or "I can't open the website", and any of dozens of causes may be behind them: DNS failure, routing loop, bandwidth saturation, packet loss, MTU mismatch, firewall block, or a server that is down. A systematic methodology finds the cause faster and more accurately than guessing at random.
Troubleshooting Methodology
| Step | Action | Example |
| --- | --- | --- |
| 1. Identify Problem | Collect symptoms from users and monitoring | "Users in Building A have been unable to reach the intranet since 9am" |
| 2. Gather Information | Ask questions, read logs, check monitoring | Who is affected? When did it start? What changed? Any error messages? |
| 3. Analyze | Use the OSI model or divide-and-conquer | Physical OK → L2 OK → L3 ping fails → routing issue |
| 4. Propose Hypothesis | Form a hypothesis from the evidence | "The Building A gateway router has an incorrect routing table" |
| 5. Test Hypothesis | Test the hypothesis with tools | Check routing table, traceroute, ping gateway |
| 6. Implement Fix | Apply a fix that addresses the root cause | Fix the static route or restart the OSPF process |
| 7. Verify | Confirm that the fix works | Users confirm access restored, monitoring shows green |
| 8. Document | Record the problem, cause, fix, and prevention | Update knowledge base, create runbook |
OSI Layer Troubleshooting
| Layer | Check | Tools |
| --- | --- | --- |
| L1 Physical | Cable, SFP, link light, power | Cable tester, OTDR, show interface (CRC errors, input errors) |
| L2 Data Link | VLAN, MAC table, STP, duplex mismatch | show mac address-table, show spanning-tree, show interfaces |
| L3 Network | IP config, routing, ACL, MTU | ping, traceroute, show ip route, show ip interface |
| L4 Transport | TCP/UDP port, firewall, NAT | telnet/nc port check, show access-lists, netstat |
| L7 Application | DNS, HTTP, application config | nslookup/dig, curl, application logs |
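The bottom-up order in the table above can be sketched as a small runner that executes one check per layer and stops at the first failure. This is a minimal illustration, not a complete diagnostic tool: the function name `first_failing_layer` and the lambda probes are hypothetical placeholders for real link, VLAN, ping, and port checks.

```python
from typing import Callable, List, Optional, Tuple

# Each check is (layer label, description, probe returning True when healthy).
Check = Tuple[str, str, Callable[[], bool]]


def first_failing_layer(checks: List[Check]) -> Optional[Tuple[str, str]]:
    """Run checks in the given order (lowest layer first).

    Returns (layer, description) of the first failing check,
    or None when every check passes.
    """
    for layer, description, probe in checks:
        try:
            healthy = probe()
        except Exception:
            # A probe that raises is treated as a failed check.
            healthy = False
        if not healthy:
            return layer, description
    return None


if __name__ == "__main__":
    # Hypothetical probe results standing in for real measurements.
    example = [
        ("L1", "link is up", lambda: True),
        ("L2", "MAC address learned on the expected VLAN", lambda: True),
        ("L3", "default gateway answers ping", lambda: False),
        ("L4", "TCP port 443 accepts connections", lambda: True),
    ]
    print(first_failing_layer(example))
```

Stopping at the first failure mirrors the "Physical OK → L2 OK → L3 ping fails" reasoning: a failure at a lower layer usually explains the failures above it, so higher-layer checks are not worth running yet.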
Essential Tools
| Tool | Purpose | Usage |
| --- | --- | --- |
| ping | Test L3 reachability + latency | ping 8.8.8.8: check if host is reachable |
| traceroute/tracert | Show path to destination (hop by hop) | traceroute 8.8.8.8: find where packets stop/slow down |
| nslookup/dig | DNS resolution test | nslookup google.com: verify DNS is working |
| netstat/ss | Show connections, listening ports | netstat -an: check if service is listening |
| arp | Show ARP table (IP to MAC mapping) | arp -a: check L2 resolution |
| ipconfig/ifconfig | Show interface IP configuration | ipconfig /all: verify IP, subnet, gateway, DNS |
| mtr | Combine ping + traceroute (continuous) | mtr 8.8.8.8: find packet loss per hop |
| nmap | Port scanning, service discovery | nmap -sV 192.168.1.1: discover open ports and services |
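When the command-line tools above are not available, or a check needs to run from a script, the same two questions ("does the name resolve?" and "is the port reachable?") can be asked with Python's standard `socket` module. This is a rough sketch; the function names `tcp_port_open` and `resolve` are hypothetical, and the default timeout is an arbitrary assumption.

```python
import socket
from typing import List


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Roughly comparable to a telnet/nc port check.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def resolve(hostname: str) -> List[str]:
    """Return the sorted unique addresses for hostname, or [] if resolution fails.

    Roughly comparable to an nslookup/dig query through the system resolver.
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # The address is the first element of the sockaddr tuple (index 4).
    return sorted({info[4][0] for info in infos})
```

A script might call `resolve("intranet.example.com")` first and then `tcp_port_open(address, 443)` for each returned address; the hostname here is only a placeholder. Note that `resolve` uses the operating system's resolver, so it also reflects the local hosts file and cache, unlike a query sent directly to a specific DNS server.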
Packet Capture
| Tool | Platform | Strengths |
| --- | --- | --- |
| Wireshark | Windows/Mac/Linux (GUI) | Widely used GUI analyzer: deep protocol analysis, display filters, flow graphs |
| tcpdump | Linux/Mac (CLI) | Command-line capture: fast, lightweight, scriptable |
| SPAN/Mirror Port | Switch | Copy traffic from one port to another for capture |
| Network TAP | Hardware | Passive inline capture device (no packet loss, full-duplex) |
| RSPAN/ERSPAN | Switch/Router | Remote SPAN: RSPAN carries mirrored traffic to another switch over a dedicated VLAN; ERSPAN carries it across routed networks in a GRE tunnel |
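Because tcpdump is scriptable, a capture can be assembled programmatically. The sketch below only builds the argument list and does not start a capture; the helper name `tcpdump_command`, the interface `eth0`, and the file name are assumptions for illustration. It relies on the standard tcpdump options `-i` (interface), `-w` (write packets to a file), and `-c` (stop after a number of packets), followed by a capture filter expression.

```python
import shlex
from typing import List, Optional


def tcpdump_command(interface: str,
                    output_file: str,
                    host: Optional[str] = None,
                    port: Optional[int] = None,
                    packet_count: Optional[int] = None) -> List[str]:
    """Build a tcpdump argument list that writes matching packets to a pcap file."""
    argv = ["tcpdump", "-i", interface, "-w", output_file]
    if packet_count is not None:
        argv += ["-c", str(packet_count)]

    # Capture filter terms are combined with "and".
    terms = []
    if host:
        terms.append(f"host {host}")
    if port is not None:
        terms.append(f"port {port}")
    if terms:
        argv.append(" and ".join(terms))
    return argv


if __name__ == "__main__":
    # Hypothetical capture: HTTPS traffic to or from one host, first 1000 packets.
    argv = tcpdump_command("eth0", "capture.pcap",
                           host="192.168.1.1", port=443, packet_count=1000)
    print(shlex.join(argv))
```

The resulting file can then be opened in Wireshark for analysis. Running tcpdump normally requires elevated privileges, so the list would typically be passed to `subprocess.run` from a suitably privileged account.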
Wireshark Filters
| Filter | Purpose |
| --- | --- |
| ip.addr == 192.168.1.1 | Filter traffic to/from specific IP |
| tcp.port == 443 | Filter traffic on TCP port 443 (typically HTTPS) |
| dns | Filter DNS traffic only |
| tcp.analysis.retransmission | Show TCP retransmissions (packet loss indicator) |
| http.request.method == "POST" | Filter HTTP POST requests |
| tcp.flags.syn == 1 && tcp.flags.ack == 0 | Show new TCP connections (SYN only) |
| frame.time_delta > 1 | Find gaps > 1 second between packets |
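To make the last filter concrete: `frame.time_delta` is the time since the previous captured frame, so `frame.time_delta > 1` selects every packet that arrived more than one second after its predecessor. The sketch below applies the same idea to a plain list of timestamps, for example exported from a capture. The function name `find_gaps` and the sample values are hypothetical.

```python
from typing import List, Tuple


def find_gaps(timestamps: List[float], threshold: float = 1.0) -> List[Tuple[int, float]]:
    """Return (index, delta) for each packet that arrived more than
    `threshold` seconds after the previous packet.

    `timestamps` are capture times in seconds, in capture order.
    """
    gaps = []
    for i in range(1, len(timestamps)):
        delta = timestamps[i] - timestamps[i - 1]
        if delta > threshold:
            gaps.append((i, delta))
    return gaps


if __name__ == "__main__":
    # Made-up capture times: the packets at index 3 and 5 follow long pauses.
    sample = [0.0, 0.2, 0.5, 2.0, 2.1, 4.5]
    for index, delta in find_gaps(sample):
        print(f"packet {index}: {delta:.2f} s after the previous packet")
```

Long gaps inside a single conversation often point at a slow server, a retransmission timeout, or an idle application, which is why this filter is a useful first pass on a "slow" capture.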
SNMP Monitoring
| Feature | Details |
| --- | --- |
| SNMP Polling | The NMS queries devices at a fixed interval (for example every 5 minutes) and collects metrics (CPU, bandwidth, errors) |
| SNMP Traps | The device sends an alert to the NMS when an event occurs (link down, high CPU, config change) |
| SNMP v3 | Supports authentication and encryption; prefer v3, because v1/v2c send community strings in cleartext |
| MIBs | Management Information Base: defines which metrics a device exposes |
| NMS Tools | PRTG, Zabbix, Nagios, LibreNMS, SolarWinds |
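Polled bandwidth figures are not read directly from the device. Interface octet counters such as `ifInOctets` (32-bit) and `ifHCInOctets` (64-bit) in the IF-MIB only ever increase, so the NMS derives utilization from the difference between two polls. The sketch below shows that arithmetic under simple assumptions: at most one counter wrap between polls and no device reboot. The function names are hypothetical.

```python
def counter_delta(previous: int, current: int, bits: int = 64) -> int:
    """Difference between two readings of a monotonically increasing counter,
    allowing for a single wrap back to zero between the readings."""
    modulus = 1 << bits
    return (current - previous) % modulus


def utilization_percent(prev_octets: int, curr_octets: int,
                        interval_seconds: float, link_bps: float,
                        bits: int = 64) -> float:
    """Average link utilization over one polling interval, in percent."""
    octets = counter_delta(prev_octets, curr_octets, bits)
    return octets * 8 * 100.0 / (interval_seconds * link_bps)


if __name__ == "__main__":
    # Made-up readings: 375,000,000 octets in 300 s on a 100 Mbit/s link.
    print(utilization_percent(1_000_000, 376_000_000, 300, 100_000_000))  # 10.0
```

A 32-bit counter on a fast link can wrap more than once inside a 5-minute interval, which makes the result meaningless; that is one reason to poll the 64-bit counters where the device supports them.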
Root Cause Analysis (RCA)
| Method | How |
| --- | --- |
| 5 Whys | Ask "why?" repeatedly (typically five times) until you reach the root cause, not just a symptom |
| Fishbone Diagram | Group causes into categories: People, Process, Technology, Environment |
| Timeline Analysis | Build a timeline of events before and after the incident to find the trigger |
| Change Correlation | Check the change log: which change preceded the problem? Recent changes are a very common cause |
| Fault Tree | Tree diagram of possible causes, narrowed down with evidence |
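Change correlation and timeline analysis both come down to lining up timestamps. As a rough sketch, assuming the change log is available as a list of timestamped records, the candidates for a trigger are the changes made shortly before the incident started. The `Change` record, the function name, and the 24-hour default window are all assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Change(NamedTuple):
    when: datetime
    device: str
    summary: str


def changes_before_incident(changes: List[Change],
                            incident_start: datetime,
                            window: timedelta = timedelta(hours=24)) -> List[Change]:
    """Return changes made within `window` before the incident, most recent first."""
    earliest = incident_start - window
    candidates = [c for c in changes if earliest <= c.when <= incident_start]
    return sorted(candidates, key=lambda c: c.when, reverse=True)


if __name__ == "__main__":
    # Made-up change log entries.
    log = [
        Change(datetime(2024, 5, 1, 22, 0), "core-sw1", "firmware upgrade"),
        Change(datetime(2024, 5, 3, 8, 30), "bldg-a-gw", "static route edited"),
        Change(datetime(2024, 5, 3, 10, 0), "fw1", "ACL cleanup"),
    ]
    incident = datetime(2024, 5, 3, 9, 0)
    for change in changes_before_incident(log, incident):
        print(change.when, change.device, change.summary)
```

A change inside the window is only a suspect, not a verdict: it still has to be tested as a hypothesis, and changes made after the incident started are excluded because they cannot have triggered it.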
Common Network Problems
| Symptom | Common Causes | Quick Check |
| --- | --- | --- |
| No connectivity | Cable, VLAN, IP config, gateway, DNS | ping gateway → ping DNS → nslookup |
| Slow performance | Bandwidth saturation, packet loss, MTU, duplex mismatch | speedtest, ping (check latency/loss), show interface errors |
| Intermittent connectivity | STP reconvergence, flapping link, ARP issue, DHCP lease | show spanning-tree, show log, show interface (up/down count) |
| Can't reach specific host | Firewall rule, ACL, routing, ARP | traceroute, check ACLs, check routing table |
| DNS failure | DNS server down, misconfigured, cache poisoned | nslookup (try different DNS servers) |
Closing Thoughts: Network Troubleshooting = Systematic Approach Wins
Methodology: identify → gather info → analyze (OSI layers) → hypothesis → test → fix → verify → document
Tools: ping, traceroute, nslookup, netstat, mtr, nmap (basic) + Wireshark/tcpdump (deep)
Packet Capture: Wireshark (GUI), tcpdump (CLI), SPAN port / TAP (capture point)
SNMP: polling (metrics at a fixed interval such as 5 minutes), traps (event alerts), v3 (authenticated and encrypted), collected by NMS tools such as Zabbix or PRTG
RCA: 5 Whys, fishbone diagram, timeline analysis, change correlation, fault tree
Common problems: no connectivity (cable/VLAN/IP), slow (bandwidth/loss/MTU), intermittent (STP/flapping)
Key: don't guess. Follow the methodology, use the right tools, and find the root cause, not just the symptoms.