2009-01-24 8 views
49

Resolving a relative URL path to its absolute path

Is there a library in Python that works like this?

>>> resolvePath("http://www.asite.com/folder/currentpage.html", "anotherpage.html") 
'http://www.asite.com/folder/anotherpage.html' 
>>> resolvePath("http://www.asite.com/folder/currentpage.html", "folder2/anotherpage.html") 
'http://www.asite.com/folder/folder2/anotherpage.html' 
>>> resolvePath("http://www.asite.com/folder/currentpage.html", "/folder3/anotherpage.html") 
'http://www.asite.com/folder3/anotherpage.html' 
>>> resolvePath("http://www.asite.com/folder/currentpage.html", "../finalpage.html") 
'http://www.asite.com/finalpage.html' 

Answer

85

Yes, there is urlparse.urljoin in Python 2, or urllib.parse.urljoin in Python 3.

>>> try: from urlparse import urljoin # Python2 
... except ImportError: from urllib.parse import urljoin # Python3 
... 
>>> urljoin("http://www.asite.com/folder/currentpage.html", "anotherpage.html") 
'http://www.asite.com/folder/anotherpage.html' 
>>> urljoin("http://www.asite.com/folder/currentpage.html", "folder2/anotherpage.html") 
'http://www.asite.com/folder/folder2/anotherpage.html' 
>>> urljoin("http://www.asite.com/folder/currentpage.html", "/folder3/anotherpage.html") 
'http://www.asite.com/folder3/anotherpage.html' 
>>> urljoin("http://www.asite.com/folder/currentpage.html", "../finalpage.html") 
'http://www.asite.com/finalpage.html' 

For copy and paste:

try: 
    from urlparse import urljoin # Python2 
except ImportError: 
    from urllib.parse import urljoin # Python3 
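
If you want the exact call shape from the question, a thin wrapper is enough. This is only a sketch: the name resolve_path is hypothetical (it mirrors the question's resolvePath) and is not part of the standard library; all of the work is done by urljoin.

```python
try:
    from urlparse import urljoin  # Python 2
except ImportError:
    from urllib.parse import urljoin  # Python 3


def resolve_path(base_url, relative_path):
    """Hypothetical helper: resolve relative_path against base_url via urljoin."""
    return urljoin(base_url, relative_path)


base = "http://www.asite.com/folder/currentpage.html"
for path in ("anotherpage.html", "/folder3/anotherpage.html", "../finalpage.html"):
    # Each result matches the corresponding example in the question.
    print(resolve_path(base, path))
```

The wrapper adds nothing beyond a name, so calling urljoin directly is usually simpler.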
+0

For an RFC 3986 compliant, Unicode-aware replacement, see [uritools](http://pythonhosted.org/uritools/). – Marian

+0

Sadly, this doesn't work if the second component is an absolute path. For example, 'urljoin("http://example.com/blah.html", "./././whoa.html")' *does* remove the dots, whereas 'urljoin("http://example.com/blah.html", "/./././whoa.html")' doesn't. – obskyr
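
A quick way to check the behavior described in this comment on your own interpreter is sketched below. The variable names are illustrative only. The result for the rooted path may depend on the Python version: the urllib.parse documentation notes that urljoin was updated in Python 3.5 to match RFC 3986 semantics, so the output is printed rather than assumed.

```python
try:
    from urlparse import urljoin  # Python 2
except ImportError:
    from urllib.parse import urljoin  # Python 3

base = "http://example.com/blah.html"

# Relative reference with leading "./" segments.
relative = urljoin(base, "./././whoa.html")

# Rooted reference (leading "/") that also contains "./" segments.
rooted = urljoin(base, "/./././whoa.html")

print(relative)
print(rooted)
print(relative == rooted)  # True only if the dots are removed in both cases
```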
