Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

How to find whether a domain uses HTTP or HTTPS (with or without WWW) using PHP?

I have a list of one million (1,000,000) domains.

+----+--------------+--------------------------+
| Id | Domain_Name  |       Correct_URL        |
+----+--------------+--------------------------+
|  1 | example1.com | http://www.example1.com  |
|  2 | example2.com | https://example2.com     |
|  3 | example3.com | https://www.example3.com |
|  4 | example4.com | http://example4.com      |
+----+--------------+--------------------------+
  • The Id and Domain_Name columns are filled.
  • The Correct_URL column is currently empty; the values shown above are the desired result.

Question: I need to fill in the Correct_URL column.

The problem I face is how to find the prefix that goes before the domain. It may be any one of http://, http://www., https:// or https://www.

How do I determine which of these four is correct using PHP? Please note that I need to run the code against all 1,000,000 domains, so I am looking for the fastest way to check.

1 Reply

There isn't really any way other than making an HTTP request to each of the possibilities and seeing whether you get a response.
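As a minimal sketch of that idea, the PHP below builds the four candidate URLs from the question and sends a HEAD request to one URL at a time through the cURL extension. The function names candidateUrls and probeUrl, the 5-second timeout, and the choice to try https before http are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Build the four candidate URLs named in the question for a bare domain.
// Trying https before http is an arbitrary ordering chosen for this sketch.
function candidateUrls(string $domain): array
{
    $domain = strtolower(trim($domain));
    $urls = [];
    foreach (['https', 'http'] as $scheme) {
        foreach (['', 'www.'] as $prefix) {
            $urls[] = "{$scheme}://{$prefix}{$domain}";
        }
    }
    return $urls;
}

// Send a HEAD request and return the HTTP status code.
// Returns 0 when no HTTP response arrived (DNS failure, refused connection, timeout).
function probeUrl(string $url, int $timeoutSeconds = 5): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,   // HEAD: headers only, no body download
        CURLOPT_RETURNTRANSFER => true,   // do not echo the response
        CURLOPT_FOLLOWLOCATION => false,  // report redirects instead of following them
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    curl_exec($ch);
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
```

A caller would loop over candidateUrls('example1.com') and pass each URL to probeUrl, recording the status code returned for each one. Some servers answer HEAD differently from GET, so a status from this sketch is a hint rather than proof.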

While you assume the URL is one of http://, http://www., https:// or https://www., real-world domains may serve zero, some, or all of those (as well as various others), and they may respond to requests with an OK, a redirect, an authentication error, and so on.
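Because the responses vary this much, the status code for each candidate needs some interpretation before anything is written to Correct_URL. The sketch below shows one possible policy, not the only reasonable one: it groups HTTP status codes into coarse categories and picks the first candidate that answered with a 2xx, falling back to one that answered 401 or 403 (a server is clearly there, it just wants credentials). The names classifyStatus and pickCorrectUrl and the preference order are hypothetical.

```php
<?php
// Group an HTTP status code into a coarse category.
// A status of 0 is used here to mean "no HTTP response was received".
function classifyStatus(int $status): string
{
    if ($status === 0) {
        return 'no_response';
    }
    if ($status >= 200 && $status < 300) {
        return 'ok';
    }
    if ($status >= 300 && $status < 400) {
        return 'redirect';
    }
    if ($status === 401 || $status === 403) {
        return 'auth';
    }
    if ($status >= 400 && $status < 500) {
        return 'client_error';
    }
    if ($status >= 500 && $status < 600) {
        return 'server_error';
    }
    return 'other';
}

// Pick a URL from a map of candidate URL => status code.
// Assumed policy: prefer a direct 2xx answer, then an auth challenge.
// Returns null when neither exists, for example when every candidate redirects.
function pickCorrectUrl(array $statusByUrl): ?string
{
    foreach (['ok', 'auth'] as $wanted) {
        foreach ($statusByUrl as $url => $status) {
            if (classifyStatus($status) === $wanted) {
                return $url;
            }
        }
    }
    return null;
}
```

Candidates that only redirect are deliberately left undecided here; resolving them would mean reading the Location header and probing the target, which this sketch does not attempt.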

HTTP and HTTPS are not attributes of a web application; they are communication protocols handled by the endpoint (the web server, or an application firewall, etc.).

As with any network communication, you must probe the host (here, the domain with or without the "www" label) and the port (most commonly, though not necessarily, 80 for HTTP and 443 for HTTPS). The probe is a shout: you send it, then wait and see whether a service is listening on the other side.
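The host-and-port probe can be sketched at the TCP level with PHP's fsockopen. This is only an illustration under assumptions: the function names probeTargets and isPortOpen and the 3-second timeout are made up for the example, and ports 80 and 443 are the common defaults rather than a guarantee.

```php
<?php
// List the (host, port) pairs to probe for a bare domain:
// the domain itself and its "www" host, each on ports 80 and 443.
function probeTargets(string $domain): array
{
    $targets = [];
    foreach ([$domain, "www.{$domain}"] as $host) {
        foreach ([80, 443] as $port) {
            $targets[] = [$host, $port];
        }
    }
    return $targets;
}

// Try to open a TCP connection and report whether something accepted it.
// The @ suppresses the PHP warning that fsockopen raises on failure.
function isPortOpen(string $host, int $port, float $timeoutSeconds = 3.0): bool
{
    $socket = @fsockopen($host, $port, $errorCode, $errorMessage, $timeoutSeconds);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}
```

An accepted connection only shows that something is listening on that port. It does not confirm that the listener speaks HTTP or HTTPS, so an actual request is still needed to be sure.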

