0x00 How Base64 encoding and decoding work

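    Base64 is a way of representing binary data using 64 printable characters; one of its main features is that it makes non-printable bytes visible. The 64 characters are, in order, A-Z, a-z, 0-9, + and /. A 6-bit binary number is enough to index all 64 of them; the mapping is shown in the table below.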

table1
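
    Because the mapping is purely positional, the index table can be reproduced in a few lines of Python. This is only a minimal sketch; the names B64_ALPHABET and index_table are chosen here for illustration and do not come from the original post.

```python
import string

# Standard Base64 alphabet: 0-25 -> A-Z, 26-51 -> a-z, 52-61 -> 0-9, 62 -> '+', 63 -> '/'
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + '+/'


def index_table():
    """Return (index, 6-bit binary string, character) for all 64 symbols."""
    return [(i, format(i, '06b'), ch) for i, ch in enumerate(B64_ALPHABET)]


# Print the first few rows of the table.
for idx, bits, ch in index_table()[:4]:
    print(idx, bits, ch)
```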

Encoding example

table2

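    As the example shows, encoding converts the text to its binary form, splits the bits into groups of six, treats each group as an index, and looks up the corresponding character. If the input length is not a multiple of 3 bytes, the number of bits is not a multiple of 6, and the last group is filled with zeros. When 1 byte is left over, two '=' are appended to the output; when 2 bytes are left over, one '=' is appended. See the example below.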

table3
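
    The encoding steps described above can be sketched by hand as follows. This is an illustrative re-implementation (b64_encode is a hypothetical helper name), not how the standard library does it internally; the last line compares it against base64.b64encode.

```python
import base64

B64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'


def b64_encode(data: bytes) -> str:
    """Encode bytes to Base64 by following the bit-level procedure step by step."""
    # 1. Every byte becomes 8 bits.
    bits = ''.join(format(b, '08b') for b in data)
    # 2. Zero-fill the tail so the bit count is a multiple of 6.
    bits += '0' * (-len(bits) % 6)
    # 3. Every 6-bit group is an index into the alphabet.
    chars = ''.join(B64_ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))
    # 4. One leftover byte -> '==', two leftover bytes -> '='.
    return chars + '=' * (-len(data) % 3)


print(b64_encode(b'ABCD'))  # QUJDRA==
print(b64_encode(b'ABCD') == base64.b64encode(b'ABCD').decode())
```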

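    Decoding is the same example read from bottom to top: drop the trailing '=', convert each Base64 character to its 6-bit index, regroup the bits into groups of eight, and convert each group to an ASCII character. Any bits left over at the end (fewer than 8) are discarded.
    The bits in the purple box in the figure above are therefore thrown away during decoding; in other words, their values have no effect on the decoded result. A simple example: QUJDRA== and QUJDRC== both decode to ABCD. Hidden information can therefore be written into these bits, which is how the steganography works.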
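
    To make the discarded bits visible, here is a small hand-written decoder that returns them alongside the decoded bytes. It is a sketch for illustration only; the function name b64_decode_with_spare is an assumption of this example, and the input is assumed to contain only characters from the standard alphabet plus '='.

```python
B64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'


def b64_decode_with_spare(text: str):
    """Decode Base64 by hand and return (decoded bytes, discarded trailing bits)."""
    body = text.rstrip('=')  # padding characters carry no data
    bits = ''.join(format(B64_ALPHABET.index(c), '06b') for c in body)
    usable = len(bits) - len(bits) % 8  # only whole bytes are kept
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data, bits[usable:]


print(b64_decode_with_spare('QUJDRA=='))  # (b'ABCD', '0000')
print(b64_decode_with_spare('QUJDRC=='))  # (b'ABCD', '0010')
```

    Both strings give the same bytes; only the four discarded bits differ, and those are exactly the bits available for hiding data.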

0x01 Base64 steganography

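    A single Base64 string offers at most 4 bits of hiding space, so hiding a message usually takes many encoded strings. To embed a message, represent each character of the plaintext as an 8-bit binary number, concatenate these into one bit string, and write the bits in order into the spare bits of the Base64 strings.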
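
    Since each line carries only 0, 2 or 4 spare bits, it helps to check in advance whether a message fits. The sketch below counts the available bits; capacity_bits and fits are hypothetical helper names introduced for this example, and the message is assumed to use one byte per character.

```python
def capacity_bits(lines):
    """Count the spare bits offered by a list of Base64-encoded lines."""
    total = 0
    for line in lines:
        line = line.strip()
        if line.endswith('=='):   # one leftover byte: 4 spare bits
            total += 4
        elif line.endswith('='):  # two leftover bytes: 2 spare bits
            total += 2
    return total


def fits(message, lines):
    """Return True if an 8-bit-per-character message fits into the spare bits."""
    return 8 * len(message) <= capacity_bits(lines)


sample = ['QUJDRA==', 'QUI=', 'QUJD']
print(capacity_bits(sample))  # 6
print(fits('G', sample))      # False: 8 bits needed, 6 available
```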

Steganography example

SSBoYXZlIGhhZCBteSBpbnZpdGF0aW9uIHRvIHRoaXMkd29ybGQncyBmZXN0aXZhbCwkYW5kIHRodXMkbXkkbGlmZSBoYXMkYmVlbiBibGVzc2VkLk==
RWFybHkgaW4gdGhlIGRheSBpdCB3YXMgd2hpc3BlcmVkIHRoYXQgd2Ugc2hvdWxkIHNhaWwgaW4gYSBib2F0LH==
b25seSB0aG91IGFuZCBJLG==
YW5kIG5ldmVyIGEgc291bCBpbiB0aGUgd29ybGQgd291bGQga25vdyBvZiB0aGlzIG91ciBwaWxncmltYWdlIHRvIG5vIGNvdW50cnkgYW5kIHRvIG5vIGVuZC6=
SW4gdGhlIG1lYW53aGlsZSBJIHNtaWxlIGFuZCBJIHNpbmcgYWxsIGFsb25lLiBJbiB0aGUgbWVhbndoaWxlIHRoZSBhaXIgaXMgZmlsbGluZyB3aXRoIHRoZSBwZXJmdW1lIG9mIHByb21pc2Uu
VGhlIHRpbWUgdGhhdCBteSBqb3VybmV5IHRha2VzIGlzIGxvbmcgYW5kIHRoZSB3YXkgb2YgaXQgbG9uZy5=
SSBjYW1lIG91dCBvbiB0aGUgY2hhcmlvdCBvZiB0aGUgZmlyc3QgZ2xlYW0gb2YgbGlnaHQs
YW5kIHB1cnN1ZWQnbXkndm95YWdlIHRocm91Z2nndGhlIHdpbGRlcm5lc3NlcyBvZiB3b3JsZHMnbGVhdmluZyBteSB0cmFjayBvbiBtYW55IGEnc3RhciBhbmQncGxhbmV0Ln==
R2l2ZSBtZSB0aGUgc3RyZW5ndGggbGlnaHRseSB0byBiZWFyIG15IGpveXMgYW5kIHNvcnJvd3Mu
R2l2ZSBtZSB0aGUgc3RyZW5ndGggdG8gbWFrZSBteSBsb3ZlIGZydWl0ZnVsIGluIHNlcnZpY2Uu
R2l2ZSBtZSB0aGUkc3RyZW5ndGkkbmV2ZXIkdG8kZGlzb3duIHRoZSBwb29yIG9yIGJlbmQkbXkka25lZXMkYmVmb3JlIGluc29sZW50IG1pZ2h0Lk==
R2l2ZSBtZSB0aGUgc3RyZW5ndGggdG8gcmFpc2UgbXkgbWluZCBoaWdoIGFib3ZlIGRhaWx5IHRyaWZsZXMu
QW5kIGdpdmUgbWUgdGhlIHN0cmVuZ3RoIHRvIHN1cnJlbmRlciBteSBzdHJlbmd0aCB0byB0aHkgd2lsbCB3aXRoIGxvdmUu
SWYgdGhlIGRheSBpcyBkb25lLG==
aWYgYmlyZHMgc2luZyBubyBtb3JlLB==
aWYgdGhlIHdpbmQgaGFzIGZsYWdnZWQgdGlyZWQsIHRoZW4gZHJhdyB0aGUgdmVpbCBvZiBkYXJrbmVzcyB0aGljayB1cG9uIG1lLG==
ZXZlbiBhcyB0aG91IGhhc3Qgd3JhcHQgdGhlIGVhcnRoIHdpdGggdGhlIGNvdmVybGV0IG9mIHNsZWVwIGFuZCB0ZW5kZXJseSBjbG9zZWQgdGhlIHBldGFscyBvZiB0aGUgZHJvb3BpbmcgbG90dXMgYXQgZHVzay7=
RnJvbSB0aGUgdHJhdmVsbGVyLJ==
d2hvc2Ugc2FjayBvZiBycm92aXNpb25zIGlzIGVtcHR5IGJlZm9yZSB0aGUgdm95YWdlIGlzIGVuZGVkLCB3aG9zZSBnYXJtZW50IGlzIHRvcm4gYW5kIGR1c3RsYWRlbiy=
d2hvc2Ugc3RyZW5ndGggaXMgZXhoYXVzdGVkLK==
cmVtb3ZlIHNoYW1lIGFuZCBwb3ZlcnR5LG==
YW5kIHJlbmV3IGhpcyBsaWZlIGxpa2UgYSBmbG93ZXIgdW5kZXIgdGhlIGNvdmVyIG9mIHRoeSBraW5kbHkgbmlnaHQu
V2hlbiBncmFjZSBpcyBsb3N0IGZyb20gbGlmZSwgY29tZSB3aXRoIGEgYnVyc3Qgb2Ygc29uZy4=
SSBrbm93IHRoYXQgdGhlIGRheSB3aWxsIGNvbWUgd2hlbiBteSBzaWdodCBvZiB0aGlzIGVhcnRoIHNoYWxsIGJlIGxvc3Qs
YW5kIGxpZmUld2lsbCB0YWtlIGl0cyBsZWF2ZSBpbiBzaWxlbmNlLCBkcmF3aW5nIHRoZSBsYXN0IGN1cnRhaW4lb3ZlciBteSBleWVzLl==
WWV0IHN0YXJzIHdpbGwgd2F0Y2ggYXQgbmlnaHQs
YW5kIG1vcm5pbmcgcmlzZSBhcyBiZWZvcmUs
YW5kIGhvdXJzIGhlYXZlIGxpa2Ugc2VhIHdhdmVzIGNhc3RpbmcgdXAgcGxlYXN1cmVzIGFuZCBwYWlucy6=
V2hlbiBJIHRoaW5rIG9mIHRoaXMgZW5kIG9mIG15IG1vbWVudHMs
dGhlIGJhcnJpZXIgb2YgdGhlIG1vbWVudHMgYnJlYWtzIGFuZCBJIHNlZSBieSB0aGUgbGlnaHQgb2YgZGVhdGggdGh5IHdvcmxkIHdpdGggaXRzIGNhcmVsZXNzIHRyZWFzdXJlcy7=
UmFyZSBpcyBpdHMgbG93bGllc3Qgc2VhdCw=
cmFyZSBpcyBpdHMgbWVhbmVzdCBvZiBsaXZlcy5=
VGhpbmdzIHRoYXQgSSBsb25nZWQgZm9yIGluIHZhaW6gYW5kIHRoaW5ncyB0aGF0IEkgZ290LS0tbGV0IHRoZW0gcGFzcy6=
TGV0IG1lIGJ1dCB0cnVseSBwb3NzZXNzIHRoZSB0aGluZ3MgdGhhdCBJIGV2ZXIgc3B1cm5lZCBhbmQgb3Zlcmxvb2tlZC6=
LS0tLUV4Y2VycHQgZnJvbSBUYWdvcmUgJkdpdGFuamFsaSJ=


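    The Base64 lines above are the result after embedding. Now let's recover the hidden content.
    From the principle described earlier, a Base64 string ending in one '=' carries 2 hidden bits, one ending in two '=' carries 4 hidden bits, and one without '=' carries none. The script is as follows.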
decrypt.py
#!/usr/bin/env python3
# Extract the message hidden in the spare bits of Base64-encoded lines.

path = './encodeFile.txt'
b64char = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

with open(path, 'r') as f:
    # Skip blank lines so the index lookups below cannot fail.
    cipher = [line.strip() for line in f if line.strip()]

bits = ''
for line in cipher:
    if line.endswith('=='):   # two '=': the low 4 bits of the last character are hidden data
        bits += format(b64char.index(line[-3]), '06b')[-4:]
    elif line.endswith('='):  # one '=': the low 2 bits of the last character are hidden data
        bits += format(b64char.index(line[-2]), '06b')[-2:]

# Regroup into bytes; an incomplete trailing group is dropped.
chunks = [bits[i:i + 8] for i in range(0, len(bits) - 7, 8)]
plaintext = ''.join(chr(int(chunk, 2)) for chunk in chunks)
print(plaintext)
Result:

image1

The hidden content is the source of the poem: Gitanjali.

Finally, here is the embedding script.
encrypt.py
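#!/usr/bin/env python3
# Hide a message in the spare bits of Base64-encoded lines.
import base64

with open('./gitanjali.txt', 'r') as f:
    data = [line.strip() for line in f]
# b64encode works on bytes in Python 3: encode each line, then decode the result back to str.
base64Data = [base64.b64encode(line.encode()).decode() for line in data]
b64char = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

msg = 'Gitanjali'  # hidden message; it must not exceed the total number of spare bits
msg_bit = ''.join(format(ord(c), '08b') for c in msg)

new_data = []
for line in base64Data:
    if line.endswith('=='):   # two '=': 4 spare bits in the last data character
        n, pos = 4, -3
    elif line.endswith('='):  # one '=': 2 spare bits in the last data character
        n, pos = 2, -2
    else:                     # no padding: nothing can be hidden in this line
        n, pos = 0, None
    if n and msg_bit:
        # Zero-pad if the message runs out in the middle of a slot.
        offset = int(msg_bit[:n].ljust(n, '0'), 2)
        # Rewrite only the last data character; str.replace would change every
        # occurrence of that character and corrupt the cover text.
        line = line[:pos] + b64char[b64char.index(line[pos]) + offset] + line[pos + 1:]
        msg_bit = msg_bit[n:]
    new_data.append(line + '\n')

if msg_bit:
    print('Warning: %d message bits did not fit' % len(msg_bit))

with open('./encodeFile.txt', 'w') as f:
    f.writelines(new_data)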
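
    As a quick sanity check, embedding and extraction can also be exercised together in memory, without any files. The sketch below is a self-contained illustration of the same technique; the helper names _slot, hide and reveal, and the toy cover text, are assumptions of this example rather than part of the scripts above.

```python
import base64

B64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'


def _slot(line):
    """Return (number of spare bits, index of the character that carries them)."""
    if line.endswith('=='):
        return 4, len(line) - 3
    if line.endswith('='):
        return 2, len(line) - 2
    return 0, -1


def hide(cover_lines, message):
    """Base64-encode each cover line and spread the message over the spare bits."""
    bits = ''.join(format(b, '08b') for b in message.encode('ascii'))
    out = []
    for text in cover_lines:
        line = base64.b64encode(text.encode()).decode()
        n, pos = _slot(line)
        if n and bits:
            value = int(bits[:n].ljust(n, '0'), 2)  # zero-pad a short final chunk
            # The spare bits of a freshly encoded line are zero, so adding is safe.
            ch = B64_ALPHABET[B64_ALPHABET.index(line[pos]) + value]
            line = line[:pos] + ch + line[pos + 1:]
            bits = bits[n:]
        out.append(line)
    if bits:
        raise ValueError('message does not fit into the cover text')
    return out


def reveal(stego_lines):
    """Collect the spare bits of every line and regroup them into characters."""
    bits = ''
    for line in stego_lines:
        n, pos = _slot(line)
        if n:
            bits += format(B64_ALPHABET.index(line[pos]), '06b')[-n:]
    usable = len(bits) - len(bits) % 8  # an incomplete trailing group is dropped
    return ''.join(chr(int(bits[i:i + 8], 2)) for i in range(0, usable, 8))


cover = ['a', 'ab', 'abc', 'abcd'] * 2  # toy cover text with 20 spare bits
print(reveal(hide(cover, 'Hi')))        # Hi
```

    When the cover text has 8 or more unused spare bits after the message, reveal returns trailing NUL characters, which can be removed with rstrip('\x00').
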
Reference: https://www.tr0y.wang/2017/06/14/Base64steg/index.html