I am trying to scrape this website using BeautifulSoup and regex. One of the scraped questions contains "double quotes", and I want to remove them before saving the text to a .txt file. However, the quotes are not being removed: I tried the .replace() method, but it had no effect. The code is as follows:
import re
import requests
from bs4 import BeautifulSoup as bs

url = 'http://www.sanfoundry.com/operating-system-mcqs-process-scheduling-queue/'
r = requests.get(url)
soup = bs(r.content, 'html.parser')  # name the parser explicitly
data = soup.find_all('div', {'class': 'entry-content'})
data1 = data[0].text
# Three alternatives: numbered question, lettered option, answer line
pattern = r'^\d{1,2}[.|)]([\s|\S].*)|(^[a-z]\)\s.*)|^View Answer\s?(Answer:.*)'
#pattern = r'^\d{1,2}[.|)]\s*(.*)|(^[a-z]\)\s.*)|^View Answer\s?(Answer:.*)'
reg = re.compile(pattern)
#with open(r'C:\Users\dhvani\Google Drive\Python\Data Scraping\yb.txt', 'a') as f:
with open(r'C:\Users\Jeri_Dabba\Google Drive\Python\Data Scraping\yb.txt', 'a') as f:
    for i in data1.split('\n'):
        m = reg.search(i)  # None when the line matches no alternative
        if m and m.group(1):
            y = m.group(1)
            y = y.replace('"', '')
            f.write(y + "\n")
When I checked the .txt file, the "double quotes" had not been removed. What might be the problem?
I am new to Python.
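One possible cause, stated as an assumption rather than something confirmed for this page: the quotes in the scraped text may be typographic ("curly") quotes such as U+201C and U+201D rather than the ASCII `"` (U+0022), in which case `.replace('"', '')` finds nothing to replace. The sketch below uses a hypothetical sample string in place of the scraped text. It first prints the code points of the quote-like characters it finds, then removes both straight and common typographic double quotes. The names `sample`, `QUOTES_TO_STRIP`, and `strip_double_quotes` are illustrative and not part of the code above.

```python
import unicodedata

# Hypothetical sample line; the real text would come from the scraped page.
sample = "What is a \u201cjob queue\u201d and a \"ready queue\"?"

# Diagnostic: list every ASCII double quote and every non-ASCII character,
# with its code point and Unicode name, to see what the text really contains.
for ch in sorted(set(sample)):
    if ch == '"' or ord(ch) > 127:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}")

# Translation table mapping the ASCII double quote and several common
# typographic double quotes to None, which makes str.translate delete them.
QUOTES_TO_STRIP = dict.fromkeys(map(ord, '"\u201c\u201d\u201e\u201f\u2033'))


def strip_double_quotes(text):
    """Remove ASCII and common typographic double quotes from text."""
    return text.translate(QUOTES_TO_STRIP)


cleaned = strip_double_quotes(sample)
print(cleaned)
```

If the diagnostic loop reports only U+0022, this assumption does not apply and the cause lies elsewhere. If it reports other characters, calling `strip_double_quotes(y)` in place of `y.replace('"', '')` would be one way to remove them.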