How to Extract URLs from a Sitemap
Using External Libraries
The easiest way to obtain URLs from a sitemap is to parse the sitemap XML file using the lxml
library. Example:
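A minimal sketch of this approach, assuming the sitemap follows the standard sitemaps.org schema. The `extract_urls` helper and the inline `SAMPLE` document are illustrative names, not part of lxml; in practice the bytes would come from downloading the sitemap file. If lxml is not installed, the sketch falls back to the standard library's ElementTree, which supports the same two calls used here.

```python
try:
    from lxml import etree  # third-party: pip install lxml
except ImportError:
    # Fallback for this sketch only; the API used below is compatible.
    import xml.etree.ElementTree as etree

# Namespace used by sitemaps that follow the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def extract_urls(sitemap_xml: bytes) -> list:
    """Return the text of every <loc> element in the sitemap (hypothetical helper)."""
    root = etree.fromstring(sitemap_xml)
    # Tags are namespace-qualified, so match on "{namespace}loc".
    return [loc.text.strip() for loc in root.iter(f"{{{SITEMAP_NS}}}loc") if loc.text]


# Stand-in for a downloaded sitemap; a real one would be fetched over HTTP.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com</loc></url>
  <url><loc>https://example.com/page1</loc></url>
  <url><loc>https://example.com/page2</loc></url>
</urlset>"""

for url in extract_urls(SAMPLE):
    print(url)
```

Passing bytes rather than a decoded string lets the parser honor the encoding declared in the XML header.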
This will print all the URLs from the sitemap:
https://example.com
https://example.com/page1
https://example.com/page2
...
Using Existing Methods
Alternatively, you can simplify the process by using the methods that IndexNow for Python already provides to retrieve and parse the sitemap XML file:
The end result will be the same:
https://example.com
https://example.com/page1
https://example.com/page2
...