Network Troubleshooting Methodology: OSI Layer, Packet Capture, Wireshark, Traceroute, and Root Cause
Network troubleshooting needs a systematic methodology to find the root cause quickly. The OSI layer approach analyzes the problem one layer at a time, packet capture records real traffic for analysis, Wireshark is the standard tool for packet analysis, traceroute reveals the path traffic takes and the latency along it, and root cause analysis finds the actual cause instead of only treating the symptom.
Many network engineers troubleshoot at random: reboot the device, swap the cable, restart the service, all without a methodology. That can cost hours on a problem that a systematic approach might resolve in 10 minutes. Working through the OSI layers, using divide-and-conquer, and applying the proper tools (Wireshark, traceroute, SNMP) can find the root cause as much as 10 times faster.
Troubleshooting Methodology
| Step | Action | Detail |
| --- | --- | --- |
| 1. Define Problem | Gather information | Who, what, when, where? Ask users, check monitoring, review recent changes |
| 2. Gather Facts | Collect data | Logs, SNMP data, interface counters, error messages, packet captures |
| 3. Consider Possibilities | List probable causes | Use the OSI model, experience, and known common issues to list the top 3-5 causes |
| 4. Create Action Plan | Plan tests | Order the tests: most likely cause first, least disruptive tests first |
| 5. Implement + Test | Execute one change at a time | Change one thing and test; if that was not the cause, revert and test the next candidate |
| 6. Observe Results | Verify fix | Is the problem really gone? Were any new problems introduced? Test end-to-end |
| 7. Document | Record solution | Record the problem, root cause, solution, and prevention in the knowledge base |
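
Step 4 (most likely cause first, least disruptive test first) can be sketched as a simple sort. This is a minimal illustration, not a prescribed tool: the `Hypothesis` class, the likelihood scores, and the disruption scale are all assumptions made up for the example, and the engineer would supply real estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    cause: str
    likelihood: float  # 0.0-1.0, the engineer's own estimate (assumed scale)
    disruption: int    # 0 = read-only check, higher = more disruptive test


def plan_tests(hypotheses):
    """Order tests: most likely cause first, least disruptive first on ties."""
    return sorted(hypotheses, key=lambda h: (-h.likelihood, h.disruption))


# Hypothetical candidate causes for a "no connectivity" ticket.
candidates = [
    Hypothesis("STP blocking the uplink", 0.2, 0),
    Hypothesis("VLAN mismatch on access port", 0.5, 0),
    Hypothesis("Faulty patch cable", 0.5, 2),
    Hypothesis("Missing static route", 0.1, 1),
]

plan = plan_tests(candidates)
for step, h in enumerate(plan, start=1):
    print(f"{step}. {h.cause} (likelihood={h.likelihood}, disruption={h.disruption})")
```

With these made-up numbers the VLAN check comes before the cable swap: both are equally likely, but looking at the VLAN configuration is read-only while replacing a cable interrupts the link.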
OSI Layer Troubleshooting
| Layer | Check | Tools |
| --- | --- | --- |
| L1 Physical | Cable, SFP, link light, CRC errors, speed/duplex mismatch | cable tester, show interface, LED indicators |
| L2 Data Link | MAC table, VLAN, STP, ARP, trunk/access mode, err-disabled | show mac address-table, show vlan, show spanning-tree |
| L3 Network | IP address, subnet mask, routing table, default gateway, ACL | ping, traceroute, show ip route, show ip interface |
| L4 Transport | TCP/UDP ports, firewall rules, NAT, connection state | telnet/nc port test, netstat, show firewall, packet capture |
| L7 Application | DNS resolution, HTTP status, application logs, TLS/SSL | nslookup/dig, curl, application logs, openssl s_client |
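
The bottom-up walk through the layers can be expressed as a loop that stops at the first failing check. The sketch below is only an illustration: the `observed` dictionary holds hypothetical probe results standing in for real commands (link status, VLAN check, gateway ping, port test, DNS lookup), and `first_failing_layer` is a name invented for this example.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[], bool]]


def first_failing_layer(checks: List[Check]) -> Optional[str]:
    """Run checks in order (lowest layer first); return the first layer that fails."""
    for layer, check in checks:
        if not check():
            return layer
    return None


# Hypothetical results; in practice each value would come from a real probe.
observed = {
    "link_up": True,
    "vlan_ok": True,
    "gateway_ping": True,
    "tcp_443_open": False,
    "dns_ok": True,
}

layer_checks: List[Check] = [
    ("L1 Physical", lambda: observed["link_up"]),
    ("L2 Data Link", lambda: observed["vlan_ok"]),
    ("L3 Network", lambda: observed["gateway_ping"]),
    ("L4 Transport", lambda: observed["tcp_443_open"]),
    ("L7 Application", lambda: observed["dns_ok"]),
]

failing = first_failing_layer(layer_checks)
print(failing or "all layers passed")
```

Stopping at the first failure reflects the idea that a fault at a lower layer makes results from the layers above it unreliable.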
Essential Troubleshooting Tools
| Tool | Purpose | Usage |
| --- | --- | --- |
| ping | Test L3 reachability + latency | ping [destination]: check packet loss, RTT, TTL |
| traceroute/tracert | Show path + per-hop latency | traceroute [dest]: identify which hop has the issue |
| mtr | Continuous traceroute + statistics | mtr [dest]: combines ping + traceroute with loss/jitter stats |
| nslookup/dig | DNS resolution testing | dig [domain] @[server]: check DNS response, TTL, records |
| telnet/nc | TCP port connectivity test | telnet [host] [port]: verify the port is open and reachable |
| tcpdump | CLI packet capture | tcpdump -i eth0 -w capture.pcap: capture to a file for Wireshark |
| Wireshark | GUI packet analyzer | Deep packet inspection, flow analysis, protocol decode |
| iperf3 | Bandwidth/throughput testing | iperf3 -c [server]: measure actual throughput between two points |
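
When telnet or nc is not installed, the same TCP port test can be approximated with Python's standard `socket` module. This is a rough sketch rather than a replacement for those tools; `tcp_port_open` is a helper name chosen for this example, and the demo connects only to a throwaway listener on 127.0.0.1 so that it runs without touching a real host.

```python
import socket
import time


def tcp_port_open(host: str, port: int, timeout: float = 2.0):
    """Attempt a TCP connect; return (reachable, seconds_taken)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:  # covers refused connections, timeouts, unreachable hosts
        return False, time.monotonic() - start


# Demo: open a temporary local listener on an OS-assigned port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]

open_ok, open_secs = tcp_port_open("127.0.0.1", demo_port)
listener.close()
closed_ok, closed_secs = tcp_port_open("127.0.0.1", demo_port)

print(f"listener up:   reachable={open_ok} in {open_secs:.4f}s")
print(f"listener down: reachable={closed_ok} in {closed_secs:.4f}s")
```

A successful connect only shows that something accepted the TCP handshake on that port. It says nothing about whether the application behind it is healthy, which is why the L7 checks still matter.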
Wireshark Analysis
| Filter | Purpose | Example |
| --- | --- | --- |
| ip.addr == x.x.x.x | Filter by IP address | ip.addr == 192.168.1.100: all traffic to/from this IP |
| tcp.port == 443 | Filter by port | tcp.port == 443: all traffic on TCP port 443 (typically HTTPS); use tcp.port == 80 for HTTP |
| tcp.analysis.retransmission | Find retransmissions | Indicates packet loss or latency issues |
| tcp.analysis.zero_window | Find zero window events | Receiver buffer full: the application is not reading fast enough |
| dns | DNS queries and responses | Check resolution time, NXDOMAIN, wrong answers |
| tcp.flags.syn == 1 && tcp.flags.ack == 0 | New TCP connections (SYN only) | Count new connections, detect SYN floods |
| frame.time_delta > 1 | Packets arriving more than 1 second after the previous captured packet | Identify delays and slow responses |
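
To make the logic behind two of these filters concrete, the sketch below applies the same tests to a small list of already-decoded packets. It does not read a pcap file or call Wireshark; the `packets` list is invented sample data, and only the flag bit values (SYN = 0x02, ACK = 0x10) come from the TCP header definition.

```python
from collections import Counter

TCP_SYN = 0x02
TCP_ACK = 0x10

# Hypothetical decoded packets: (timestamp in seconds, source IP, TCP flags byte)
packets = [
    (0.00, "10.0.0.5", 0x02),  # SYN
    (0.01, "10.0.0.9", 0x12),  # SYN+ACK
    (0.02, "10.0.0.5", 0x10),  # ACK
    (0.50, "10.0.0.5", 0x02),  # SYN
    (2.10, "10.0.0.7", 0x02),  # SYN after a long silence
    (2.11, "10.0.0.9", 0x18),  # PSH+ACK
]


def syn_only_by_source(pkts):
    """Per-source count of packets with SYN set and ACK clear (new connections)."""
    return Counter(
        src for _, src, flags in pkts
        if flags & TCP_SYN and not flags & TCP_ACK
    )


def gaps_over(pkts, threshold=1.0):
    """Return (packet index, delta) where the gap to the previous packet exceeds threshold."""
    return [
        (i, round(cur[0] - prev[0], 3))
        for i, (prev, cur) in enumerate(zip(pkts, pkts[1:]), start=1)
        if cur[0] - prev[0] > threshold
    ]


print(syn_only_by_source(packets))
print(gaps_over(packets))
```

A single source with a very high SYN-only count and few completed handshakes is the pattern the SYN filter is meant to expose, while the gap list points at the packets worth inspecting for slow responses.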
Traceroute Analysis
| Pattern | Meaning | Action |
| --- | --- | --- |
| * * * (all hops) | ICMP blocked or destination unreachable | Try TCP traceroute (tcptraceroute) or UDP |
| Latency spike at hop N | Congestion or distance at that hop | Check whether it is consistent across runs; if so, congestion on that link is possible |
| Latency spike then normal | Router is slow to respond (ICMP rate limit) but forwards fine | Not a real problem: the router deprioritizes ICMP replies |
| Increasing latency from hop N | Real congestion starting at hop N | Contact the ISP/provider managing that hop |
| Path changes between traces | Load balancing or routing instability | Use Paris traceroute, which keeps the flow identifier constant so per-flow ECMP load balancing does not change the path between probes |
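
The difference between "spike then normal" and "increasing latency from hop N" can be checked mechanically once per-hop RTTs are available. The following is a simplified sketch under stated assumptions: `classify_hops` is a hypothetical helper, the 50 ms threshold is arbitrary, and the RTT lists are invented rather than taken from a real trace.

```python
def classify_hops(rtts_ms, spike_ms=50.0):
    """Label latency jumps in a list of per-hop RTTs (None = hop did not reply).

    Returns (hop_number, label) tuples, with hop numbers starting at 1.
    """
    findings = []
    for i, rtt in enumerate(rtts_ms):
        if rtt is None:
            continue
        # Most recent earlier hop that actually replied.
        prev = next((r for r in reversed(rtts_ms[:i]) if r is not None), None)
        if prev is None or rtt - prev < spike_ms:
            continue
        later = [r for r in rtts_ms[i + 1:] if r is not None]
        if not later:
            findings.append((i + 1, "inconclusive"))  # spike at the last replying hop
        elif all(r >= prev + spike_ms for r in later):
            findings.append((i + 1, "sustained"))     # stays high: likely real congestion
        else:
            findings.append((i + 1, "isolated"))      # later hops are fast: likely ICMP rate limiting
    return findings


# Invented examples of the two patterns.
rate_limited = [1.2, 8.5, 120.0, 9.1, 10.3]
congested = [1.2, 8.5, 95.0, 101.0, 110.0]

print(classify_hops(rate_limited))
print(classify_hops(congested))
```

The reasoning is that probes to later hops must pass through hop N, so if any later hop answers quickly, hop N is forwarding traffic without delay and only its own ICMP replies are slow.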
Common Problems + Root Cause
| Symptom | Common Root Cause | Verification |
| --- | --- | --- |
| No connectivity | Cable, VLAN mismatch, port shutdown, STP blocking | Check L1 (link light), L2 (VLAN, STP), L3 (IP, route) |
| Slow performance | Duplex mismatch, congestion, MTU mismatch, slow DNS | show interface (errors/drops), iperf3, packet capture |
| Intermittent drops | Flapping link, STP reconvergence, microbursts, bad cable | show log, show interface (input/output errors), SNMP trending |
| Can’t reach specific host | ACL blocking, route missing, firewall rule, ARP issue | traceroute, show ip route, show access-lists, show arp |
| DNS resolution fails | Wrong DNS server, DNS unreachable, domain expired | nslookup/dig, check DNS server connectivity |
Closing Thoughts: Troubleshooting = Methodology + Tools + Experience
Network Troubleshooting Methodology: define → gather facts → consider causes → plan → implement (one at a time) → verify → document
OSI Approach: L1 (physical/cable) → L2 (VLAN/STP/MAC) → L3 (IP/route) → L4 (ports/firewall) → L7 (DNS/app)
Tools: ping (reachability), traceroute (path), mtr (continuous), Wireshark (deep analysis), iperf3 (throughput)
Wireshark: filter by IP/port, find retransmissions, zero windows, DNS issues, SYN floods, delays
Traceroute: spike at hop (congestion), spike then normal (ICMP rate limit), increasing latency (real issue)
Common: no connect (L1-L3), slow (duplex/MTU/congestion), intermittent (flapping/STP), DNS (wrong server)
Key: a systematic approach beats random guessing; always start with “what changed?” and work through the OSI layers