Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Extracting hidden values from a form with BeautifulSoup

I'm trying to parse the HTML response below. It is a fragment rather than a complete page (there is no head or body):

    <div class="modal-header">
        <button type="button" class="close" data-dismiss="modal"><img src=" /sites/all/themes/iprn_bootstrap/images/ico_close.png"></button>
        <h4 class="modal-title">Login</h4>
    </div>
    <div id="ajax-forms-messages"></div>
    <div class="modal-body"><form action="/user/signin?destination=user/signin" method="post" id="user-login-form" accept-charset="UTF-8"><div><div class="form-item form-item-name form-type-textfield form-group"><input placeholder="E-mail *" class="form-control form-text required" type="text" id="edit-name" name="name" value="" size="15" maxlength="60" /> <label class="control-label element-invisible" for="edit-name">E-mail <span class="form-required" title="This field is required.">*</span></label>
</div><div class="form-item form-item-pass form-type-password form-group"><input placeholder="Password *" class="form-control form-text required" type="password" id="edit-pass" name="pass" size="15" maxlength="128" /> <label class="control-label element-invisible" for="edit-pass">Password <span class="form-required" title="This field is required.">*</span></label>
</div><input type="hidden" name="form_build_id" value="form-lLpMWePGycFNEi-XgTGuRW-vy2lNBld-rtTYEKX7JHk" />
<input type="hidden" name="form_id" value="user_login_block" />
<div class="form-actions form-wrapper form-group" id="edit-actions"><button id="login-submit-btn" type="submit" name="op" value="Log in" class="btn btn-primary form-submit icon-before"><span class="icon glyphicon glyphicon-log-in" aria-hidden="true"></span>
 Log in</button>
</div><a href="/user/password" class="forget" title="Request new password">Forgot your password?</a>
    <div class="section white">Not registered? <a href="/user/registration" class="link-register">Register</a></div></div></form></div>

I'm trying to get the values of the hidden inputs, but I'm struggling with BeautifulSoup. This is what I tried:

soup.select_one('#form_build_id')['value']

However, that didn't work.

What's a more elegant way to extract both hidden "values"?

<input type="hidden" name="form_build_id" value="form-lLpMWePGycFNEi-XgTGuRW-vy2lNBld-rtTYEKX7JHk" />
<input type="hidden" name="form_id" value="user_login_block" />

1 Reply

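The selector '#form_build_id' matches nothing because '#' selects on the id attribute, and the hidden inputs only have a name attribute. Instead, you can search for all input tags where type="hidden":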

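from bs4 import BeautifulSoup

# html is the response text shown in the question
soup = BeautifulSoup(html, "html.parser")

# Find every <input> whose type attribute is "hidden" and print its value
for tag in soup.find_all("input", type="hidden"):
    print(tag["value"])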

Output:

form-lLpMWePGycFNEi-XgTGuRW-vy2lNBld-rtTYEKX7JHk
user_login_block
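
If you need the values by name rather than just printed in order, a CSS attribute selector may be closer to what the original select_one call was aiming for. The following is only a sketch: the html string is a trimmed copy of the form from the question, and the variable names build_id and hidden_fields are made up for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the form from the question (only the inputs are kept).
html = """
<form action="/user/signin?destination=user/signin" method="post" id="user-login-form">
  <input type="text" id="edit-name" name="name" value="" />
  <input type="password" id="edit-pass" name="pass" />
  <input type="hidden" name="form_build_id" value="form-lLpMWePGycFNEi-XgTGuRW-vy2lNBld-rtTYEKX7JHk" />
  <input type="hidden" name="form_id" value="user_login_block" />
</form>
"""

soup = BeautifulSoup(html, "html.parser")

# '#form_build_id' looks for id="form_build_id", which no element has,
# so select_one returns None and indexing it with ['value'] would fail.
missing = soup.select_one("#form_build_id")

# An attribute selector matches on the name attribute instead.
build_id = soup.select_one('input[name="form_build_id"]')["value"]

# Collect every hidden input into a dict keyed by its name attribute
# (hidden_fields is a hypothetical name, not part of any API).
hidden_fields = {
    tag["name"]: tag.get("value", "")
    for tag in soup.select('input[type="hidden"]')
}

print(build_id)
print(hidden_fields)
```

The dict form is handy when both values have to be sent back with the login request, since each one can then be looked up by its field name.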


...