I have written a basic network packet sniffer in Python. When packets arrive, the program prints garbled characters in my gnome-terminal. The code:
#!/usr/bin/python
import socket
import struct
import binascii

# 8 == socket.htons(0x0800) on little-endian hosts: capture IPv4 frames only
s = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, 8)
i = 1
while True:
    pkt = s.recvfrom(2048)

    ethhead = pkt[0][0:14]  # first 14 bytes: Ethernet header
    eth = struct.unpack("!6s6s2s", ethhead)
    print "--------Ethernet Frame %d------" % i
    print "Destination MAC: ", binascii.hexlify(eth[0])
    print "Source MAC: ", binascii.hexlify(eth[1])
    print "EtherType: ", binascii.hexlify(eth[2])

    ipheader = pkt[0][14:34]  # next 20 bytes: IP header without options
    ip_hdr = struct.unpack("!8sB3s4s4s", ipheader)
    print "-----------IP------------------"
    print "TTL :", ip_hdr[1]
    print "Source IP", socket.inet_ntoa(ip_hdr[3])
    print "Destination IP", socket.inet_ntoa(ip_hdr[4])

    tcpheader = pkt[0][34:54]  # next 20 bytes: TCP header without options
    tcp_hdr = struct.unpack("!HH9ss6s", tcpheader)
    print "---------TCP----------"
    print "Source Port ", tcp_hdr[0]
    print "Destination port ", tcp_hdr[1]
    print "Flag ", binascii.hexlify(tcp_hdr[3])
    print "\n\n"
    i += 1

    print pkt[0][54:]  # raw payload bytes, written to the terminal undecoded
The sample output:
I set Terminal > Set Character Encoding > Unicode (UTF-8), but that did not help either.
I am using Kali Linux 1.1.0, Gnome Terminal v 3.4.1.1.
Bytes in the range 128-255 do not generally form valid UTF-8 strings. Even if the payload is encoded in UTF-8, there is no guarantee that packet boundaries will coincide with code point boundaries.
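To illustrate the boundary problem, here is a minimal sketch that splits one UTF-8 string across two hypothetical packets. The names `first_packet`, `second_packet` and `is_valid_utf8` are made up for this example and do not come from the sniffer above.

```python
# A string whose last character (U+00E9) takes two bytes in UTF-8.
encoded = u"caf\u00e9".encode("utf-8")  # the bytes 'c', 'a', 'f', 0xC3, 0xA9

# Pretend the data was split between the two bytes of that character.
first_packet, second_packet = encoded[:4], encoded[4:]


def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8, False otherwise."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


whole_ok = is_valid_utf8(encoded)         # True: the sequence is complete
first_ok = is_valid_utf8(first_packet)    # False: ends in the middle of a sequence
second_ok = is_valid_utf8(second_packet)  # False: starts with a continuation byte
```

Neither half is valid UTF-8 on its own, even though the combined stream is.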
Perhaps as a workaround you would like to render the packets as e.g. ISO-8859-1 strings, or maybe even (gasp) offer an option to specify an encoding. The only change you need is
pkt[0][54:].decode(encoding)
If there ever is a packet containing only complete, valid UTF-8 sequences, mentally decoding it from this output should be possible with a bit of training.
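As a rough sketch of that workaround, the helper below decodes a payload with a caller-chosen encoding. The function name `render_payload` and the sample bytes are assumptions made for illustration; passing `"replace"` as the error handler is an addition on top of the suggestion above, so that an unsuitable encoding does not raise.

```python
def render_payload(payload, encoding="iso-8859-1"):
    """Decode raw payload bytes for display.

    ISO-8859-1 maps every byte value to a code point, so it never fails.
    For other encodings, undecodable bytes are replaced with U+FFFD.
    """
    return payload.decode(encoding, "replace")


# Hypothetical payload: ASCII text, a two-byte UTF-8 sequence, and a stray 0xFF.
sample = b"GET / HTTP/1.1\r\n\xc3\xa9\xff"

as_latin1 = render_payload(sample)         # one character per byte
as_utf8 = render_payload(sample, "utf-8")  # 0xC3 0xA9 -> U+00E9, 0xFF -> U+FFFD
```

In the sniffer, this would replace the last line of the loop with something like `print render_payload(pkt[0][54:])`. Control characters in the payload are still passed through to the terminal unchanged.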