January 2023

Chen Lan, Wife of Xiang Huaqiang, Mocked over Her First Douyin Livestream

Author: Liang Fan (梁凡)
Date: 2023-01-26
Category: about output

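On December 19, 2022, Chen Lan (陈岚), wife of Xiang Huaqiang (向华强), held her first livestream on Douyin and was mocked by viewers for refusing to eat the products being featured. But isn't that perfectly normal? Could you afford the things she actually likes to eat? Douyin livestreaming has always been aimed at the lower-tier mass market.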

Python Code for Calling Youdao Dictionary

Author: Liang Fan (梁凡)
Date: 2023-01-25
Category: about python

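import requests

url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
# Send a browser-like User-Agent to get past basic anti-scraping checks
header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36"
}


def youdao(s):
    """Translate s via the Youdao web endpoint and return the first translated segment."""
    dat = {
        "i": s,  # text to translate
        "from": "AUTO",
        "to": "AUTO",
        "smartresult": "dict",
        "client": "fanyideskweb",
        "doctype": "json",
        "version": "2.1",
        "keyfrom": "fanyi.web",
        "action": "FY_BY_CLICKBUTTION"
    }
    resp = requests.post(url, headers=header, data=dat, timeout=10)
    # Fail loudly on HTTP errors instead of indexing into an error page
    resp.raise_for_status()
    # Parse the body as JSON and pick the first translated segment
    return resp.json()['translateResult'][0][0]['tgt']


print(youdao('cat'))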
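
The function above returns only the first segment, `['translateResult'][0][0]['tgt']`. For input with several sentences or lines, a helper along these lines may be useful. It is a minimal sketch that assumes the response is a list of paragraphs, each a list of dicts with a `tgt` key, as the indexing above implies. The name `join_translation` and the sample dict are hypothetical, not part of any Youdao API.

```python
def join_translation(resp):
    """Join every 'tgt' segment of a response shaped like the one indexed above."""
    paragraphs = resp.get('translateResult') or []
    return '\n'.join(
        ''.join(segment.get('tgt', '') for segment in paragraph)
        for paragraph in paragraphs
    )


# Hand-written sample with the same nesting, not a captured API response
sample = {
    'translateResult': [
        [{'src': 'cat', 'tgt': '猫'}],
        [{'src': 'dog', 'tgt': '狗'}],
    ]
}
print(join_translation(sample))
```

A response with no `translateResult` key yields an empty string rather than raising `KeyError`, which makes it easier to decide separately how to report failures.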

Cleaning the HTML-Formatted Longman Dictionary

Author: Liang Fan (梁凡)
Date: 2023-01-25
Category: about english

import re
import time

# Tags that separate fields become "|"; every remaining tag is then stripped
SEPARATOR_TAGS = re.compile(r'<font color=black>|<span class=L_POS>|</font>')
ANY_TAG = re.compile(r'<.*?>')

out_path = r'..\data\朗文双解清洗' + str(time.time()) + '.csv'
with open(r'..\data\朗文双解.csv', encoding='utf-8') as src, \
        open(out_path, 'a', encoding='utf-8') as out:
    for line in src:
        # Headword: everything before the last tab on the line
        match = re.match(r'(.*)\t', line)
        word = match.group(1) if match else ''
        # Entry body: each span from the syllable marker to the closing </span>
        bodies = re.findall(r'<font class=L_SYL>(.*?)</span>', line)
        text = ANY_TAG.sub('', SEPARATOR_TAGS.sub('|', ' '.join(bodies)))
        out.write(word + '\t' + text + '\n')
        print(word, text)
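
To check the cleanup without the full dictionary file, the tag handling can be pulled into a small function and run on a single line. This is a simplified sketch: it applies the same tag substitutions as the script above to everything after the first tab, and additionally collapses repeated `|` separators, which the script does not do. The function name `clean_entry` and the sample line are made up for illustration.

```python
import re

SEPARATOR_TAGS = re.compile(r'<font color=black>|<span class=L_POS>|</font>')
ANY_TAG = re.compile(r'<.*?>')


def clean_entry(line):
    """Return (headword, cleaned body) for one tab-separated dictionary line."""
    word, _, body = line.rstrip('\n').partition('\t')
    text = SEPARATOR_TAGS.sub('|', body)   # field-boundary tags become "|"
    text = ANY_TAG.sub('', text)           # drop every other tag
    # Extra step (not in the script above): collapse runs of "|" and trim the ends
    text = re.sub(r'\|+', '|', text).strip('|')
    return word, text


# Hypothetical sample line, written by hand to resemble the markup targeted above
sample = ('cat\t<font class=L_SYL>cat<font color=black>/kæt/</font>'
          '<span class=L_POS>n</span>\n')
print(clean_entry(sample))  # ('cat', 'cat|/kæt/|n')
```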

Cleaning the Cambridge Dictionary in MDX Format

Author: Liang Fan (梁凡)
Date: 2023-01-25
Category: about english

import re
import time

# Opening tag of the Chinese translation; replaced with "|" as a field separator
TRANSLATION_TAG = '<font style="color:navy;margin-left:12pt;" >'

out_path = r'..\data\剑桥双解清洗' + str(time.time()) + '.csv'
with open(r'..\data\剑桥双解清洗.txt', encoding='utf-8') as src, \
        open(out_path, 'a', encoding='utf-8') as out:
    for line in src:
        # Headwords are wrapped in a bold <font> tag
        words = re.findall(r'<font style="font-weight:bold;">(.*?)</font>', line)
        # \\n matches a literal backslash followed by "n" in the text, not a newline
        bodies = re.findall(r'<font style="margin-right:1px;">(.*?)\\n', line)
        text = ' '.join(bodies).replace(TRANSLATION_TAG, '|')
        text = re.sub(r'<.*?>', '', text)  # strip all remaining tags
        print(text)
        out.write(' '.join(words) + '\t' + text + '\n')
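
The output file has a `.csv` extension but is written as tab-separated text by hand. One possible alternative, sketched below, is to let the standard `csv` module write the rows so that any tabs or quotes inside a definition are escaped consistently. The sample line is hypothetical and ends in a literal backslash followed by `n`, which is what the `\\n` pattern in the script above looks for. `clean_line` is an illustrative name, and the rows go to an in-memory buffer rather than a file.

```python
import csv
import io
import re

HEADWORD = re.compile(r'<font style="font-weight:bold;">(.*?)</font>')
# \\n in a raw pattern matches a literal backslash followed by "n", not a newline
BODY = re.compile(r'<font style="margin-right:1px;">(.*?)\\n')
TRANSLATION_TAG = '<font style="color:navy;margin-left:12pt;" >'


def clean_line(line):
    """Return (headword, body) with the translation tag as '|' and other tags removed."""
    words = HEADWORD.findall(line)
    bodies = BODY.findall(line)
    text = re.sub(r'<.*?>', '', ' '.join(bodies).replace(TRANSLATION_TAG, '|'))
    return ' '.join(words), text


# Hand-written sample line (hypothetical), ending in a literal backslash-n
sample = (r'<font style="font-weight:bold;">cat</font>'
          r'<font style="margin-right:1px;">a small animal'
          r'<font style="color:navy;margin-left:12pt;" >猫</font>\n')

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
writer.writerow(clean_line(sample))
print(buffer.getvalue(), end='')
```

When writing to a real file with `csv.writer`, the Python documentation recommends opening it with `newline=''` so that line endings are not translated twice.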

Cleaning the Plain-Text Longman Dictionary

Author: Liang Fan (梁凡)
Date: 2023-01-25
Category: about english

import re
import time

# Timestamp such as 20230125103000, used to give each run its own output file
newtime = time.strftime("%Y%m%d%H%M%S")
out_path = r'..\data\朗文双解清洗' + newtime + '.csv'
with open(r'..\data\朗文双解.txt', encoding='utf-8') as src, \
        open(out_path, 'a', encoding='utf-8') as out:
    for line in src:
        # Lines carrying a ★ marker
        en = re.findall(r'★.*', line)
        # Numbered items such as "1. ..." through "9. ..."
        word = re.findall(r'[1-9]\..*', line)
        print(en, word)
        # Write only what matched, one item per line
        for item in en + word:
            out.write(item + '\n')
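
The script above extracts ★ lines and numbered lines independently. If the numbered items belong to the most recent ★ line, they can be grouped into one record per entry. The sketch below rests on exactly that assumption about the file layout, which the post does not confirm. The function name `group_entries` and the sample lines are invented for illustration, and numbered items are taken to start with a digit from 1 to 9 followed by a period.

```python
import re


def group_entries(lines):
    """Group numbered items under the most recent ★ line."""
    entries = []
    for line in lines:
        head = re.search(r'★.*', line)
        item = re.search(r'[1-9]\..*', line)
        if head:
            # A ★ line starts a new entry with an empty item list
            entries.append((head.group(0), []))
        elif item and entries:
            # A numbered line is attached to the entry opened last
            entries[-1][1].append(item.group(0))
    return entries


# Hypothetical sample lines, not taken from the real dictionary file
sample = [
    '★cat\n',
    '1. a small furry animal\n',
    '2. a large wild animal\n',
    '★dog\n',
    '1. a common pet\n',
]
for head, items in group_entries(sample):
    print(head, items)
```

Numbered lines that appear before any ★ line are skipped, since there is no entry to attach them to.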
