Code Snippet Notes 14

1️⃣[shuffle list]

To shuffle a list, use the shuffle function from the random module:

from random import shuffle
l = [1, 2, 3, 4]
shuffle(l)  # in-place operation; returns None

To shuffle two corresponding lists, that is, lists of equal length whose elements are paired one-to-one, you can do the following:

# borrowed from Stack Overflow
import random

def shuffle(a, b):
    # Shuffle both lists in place with the same permutation.
    assert len(a) == len(b)
    start_state = random.getstate()  # save the RNG state
    random.shuffle(a)
    random.setstate(start_state)     # restore it so b gets the same permutation
    random.shuffle(b)

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [11, 12, 13, 14, 15, 16, 17, 18, 19]
shuffle(a, b)
print(a)  # e.g. [9, 7, 3, 1, 2, 5, 4, 8, 6]
print(b)  # e.g. [19, 17, 13, 11, 12, 15, 14, 18, 16]
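
An alternative that leaves the global random state untouched is to draw one permutation of indices and apply it to both lists. The sketch below is only illustrative: the name shuffle_together and its optional seed parameter are made up for this example and are not part of the original snippet. Unlike the version above, it returns new lists instead of shuffling in place.

```python
import random

def shuffle_together(a, b, seed=None):
    """Return shuffled copies of a and b that share one permutation."""
    assert len(a) == len(b)
    rng = random.Random(seed)   # private RNG; the global state is not touched
    idx = list(range(len(a)))
    rng.shuffle(idx)            # one permutation, applied to both lists
    return [a[i] for i in idx], [b[i] for i in idx]

a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [11, 12, 13, 14, 15, 16, 17, 18, 19]
new_a, new_b = shuffle_together(a, b, seed=0)

# The pairing survives: each element of new_b is still its partner plus 10.
print(all(y == x + 10 for x, y in zip(new_a, new_b)))  # True
```

Using a private random.Random instance also makes the shuffle reproducible through the seed without affecting other code that relies on the random module.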

2️⃣[inverse tensor]

PyTorch does not yet support negative step sizes, so a tensor cannot be reversed the way a Python list can with l[::-1].
One workaround:

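import torch

tensor = torch.tensor([1, 2, 3, 4])  # example input (not in the original snippet)
# indices from last to first along dim 0
inv_idx = torch.arange(tensor.size(0)-1, -1, -1).long()
# or equivalently torch.range(tensor.size(0)-1, 0, -1).long() (torch.range is deprecated)
inv_tensor = tensor.index_select(0, inv_idx)
# or equivalently
inv_tensor = tensor[inv_idx]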
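
Newer PyTorch releases also provide torch.flip, which reverses a tensor along the given dimensions without building an index tensor by hand. A minimal sketch, assuming a PyTorch version that includes torch.flip; the example tensor x is made up for illustration:

```python
import torch

x = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])  # hypothetical example tensor

# Reverse along dim 0 (rows) with torch.flip.
flipped = torch.flip(x, dims=[0])

# Same result via the index_select approach from the snippet above.
inv_idx = torch.arange(x.size(0) - 1, -1, -1)
selected = x.index_select(0, inv_idx)

print(torch.equal(flipped, selected))  # True
print(flipped.tolist())                # [[4, 5, 6], [1, 2, 3]]
```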

3️⃣[GRU initialization]

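def _gru_init(self):   # orthogonal init seems to work better
    # Without .data this raised a leaf-variable in-place error for the author:
    # weight_ih_l0 is a leaf parameter that requires grad, so in-place
    # modification is rejected unless done on .data or under torch.no_grad().
    nn.init.orthogonal_(self.word_RNN.weight_ih_l0.data)
    nn.init.orthogonal_(self.word_RNN.weight_hh_l0.data)
    self.word_RNN.bias_ih_l0.data.zero_()
    self.word_RNN.bias_hh_l0.data.zero_()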
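
For a multi-layer or bidirectional GRU, the same initialization can be applied to every layer by looping over named_parameters() instead of naming each weight. This is a sketch rather than the author's code: the helper init_gru and the layer sizes are assumptions chosen for illustration, and it uses torch.no_grad() in place of .data.

```python
import torch
import torch.nn as nn

def init_gru(gru):
    """Orthogonal weights and zero biases for every layer and direction."""
    with torch.no_grad():  # permit in-place changes to leaf parameters
        for name, param in gru.named_parameters():
            if name.startswith("weight"):
                nn.init.orthogonal_(param)
            elif name.startswith("bias"):
                param.zero_()

# Hypothetical sizes, chosen only for the demonstration.
gru = nn.GRU(input_size=8, hidden_size=16, num_layers=2, bidirectional=True)
init_gru(gru)

biases_are_zero = all(
    p.abs().sum().item() == 0
    for n, p in gru.named_parameters() if n.startswith("bias")
)
print(biases_are_zero)  # True
```

As in the original method, the orthogonal initialization is applied to each stacked weight matrix as a whole, not to each gate separately.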

4️⃣[sort counter]

Goal: compute the distribution of the number of sentences per document, sorted by sentence count.

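from collections import Counter
# documents: a list of documents, each a list of sentences
n_sents = [len(sentences) for sentences in documents]
n_lengths = Counter(n_sents)  # sentence count -> number of documents
n_lengths = sorted(n_lengths.items())  # (sentence count, frequency) pairs, ascending by count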
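
To make the result concrete, here is the same idea run on a tiny corpus. The documents list is made up for illustration; the sketch also shows Counter.most_common() for the case where ordering by frequency is wanted instead of ordering by sentence count.

```python
from collections import Counter

# Toy corpus (made up): each document is a list of sentences.
documents = [
    ["s1", "s2"],
    ["s1"],
    ["s1", "s2", "s3"],
    ["s1", "s2"],
]

n_sents = [len(sentences) for sentences in documents]
counts = Counter(n_sents)

by_length = sorted(counts.items())   # ascending by sentence count
by_frequency = counts.most_common()  # descending by number of documents

print(by_length)        # [(1, 1), (2, 2), (3, 1)]
print(by_frequency[0])  # (2, 2)
```

Sorting the items works because tuples compare element by element, so the pairs are ordered by their first field, the sentence count.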