Class URLUtil


  • public final class URLUtil
    extends Object
    A collection of utility functions (static methods) operating on URLs.

    Works with any hierarchical URL. Does not work with opaque URLs, except for a few functions that work with "jar:" URLs.

    Note that, for these few functions, the path of a "jar:" URL (e.g. jar:http://www.foo.com/bar/baz.jar!/COM/foo/Quux.class) is everything after the "!" separator, including the leading "/".
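
    As a rough illustration of this convention, the sketch below extracts the path of a "jar:" URL using only the JDK. The class JarPathSketch and its method are hypothetical names introduced for this example; this is not the implementation of URLUtil.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical helper, not part of URLUtil: illustrates the "jar:" path convention. */
final class JarPathSketch {
    /**
     * Returns the part of a "jar:" URL that follows the "!" separator,
     * keeping the leading "/", or null if url is not a "jar:" URL.
     */
    static String jarEntryPath(URL url) {
        if (!"jar".equals(url.getProtocol())) {
            return null;
        }
        // For "jar:" URLs, getPath() is the nested URL followed by "!/" and the entry path.
        String path = url.getPath();
        int separator = path.lastIndexOf("!/");
        // Skip the '!' but keep the '/' so that the result starts with "/".
        return (separator < 0) ? null : path.substring(separator + 1);
    }

    @SuppressWarnings("deprecation") // URL constructors are deprecated since Java 20
    public static void main(String[] args) throws MalformedURLException {
        URL url = new URL("jar:http://www.foo.com/bar/baz.jar!/COM/foo/Quux.class");
        System.out.println(jarEntryPath(url)); // prints "/COM/foo/Quux.class"
    }
}
```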

    • Field Detail

      • EMPTY_LIST

        public static final URL[] EMPTY_LIST
        A ready-to-use empty list of URLs.
    • Method Detail

      • isFileURL

        public static boolean isFileURL​(URL url)
        Returns true if specified URL is a file: URL; otherwise returns false.
      • isJarURL

        public static boolean isJarURL​(URL url)
        Returns true if specified URL is a jar: URL; otherwise returns false.
      • isDataURL

        public static boolean isDataURL​(URL url)
        Returns true if specified URL is a data: URL; otherwise returns false.
      • urlToFile

        public static File urlToFile​(URL url)
        Converts a file: URL to a File.

        On Windows, this function converts a "file:" URL having a host (other than "localhost") to a UNC filename. For example, it converts "file://foo/bar/gee.txt" to "\\foo\bar\gee.txt".

        Parameters:
        url - the URL to be converted
        Returns:
        an absolute File or null if url cannot be converted to a File (for example, because url is not a file: URL)
        See Also:
        isFileURL(java.net.URL), FileUtil.fileToURL(java.io.File), urlOrFile(java.lang.String)
      • urlToURI

        public static URI urlToURI​(URL url)
        Similar to java.net.URL.toURI() except that this utility will not throw a java.net.URISyntaxException if the URL spec contains illegal characters such as spaces. In such a case, special efforts are made to nevertheless return a URI equivalent to specified URL.
        Parameters:
        url - URL to be converted
        Returns:
        converted URI or null if this really cannot be done
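
        One way to approximate this lenient behavior with the JDK alone is sketched below: try the strict conversion first, then fall back to the multi-argument java.net.URI constructor, which quotes illegal characters. LenientUriSketch is a hypothetical name, and the fallback strategy is an assumption, not necessarily what urlToURI does.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

/** Hypothetical helper, not part of URLUtil. */
final class LenientUriSketch {
    /** Converts url to a URI, quoting illegal characters if needed; returns null on failure. */
    static URI toUriLeniently(URL url) {
        try {
            return url.toURI(); // strict conversion first
        } catch (URISyntaxException strictFailure) {
            try {
                // The multi-argument constructor quotes illegal characters such as spaces.
                // Caveat: it always quotes '%', so existing %HH escapes get escaped twice.
                return new URI(url.getProtocol(), url.getUserInfo(), url.getHost(),
                               url.getPort(), url.getPath(), url.getQuery(), url.getRef());
            } catch (URISyntaxException lenientFailure) {
                return null;
            }
        }
    }

    @SuppressWarnings("deprecation") // URL constructors are deprecated since Java 20
    public static void main(String[] args) throws MalformedURLException {
        URL url = new URL("http://www.acme.com/My report.doc");
        System.out.println(toUriLeniently(url)); // prints "http://www.acme.com/My%20report.doc"
    }
}
```

        The double-escaping caveat means this sketch is only suitable for URLs whose illegal characters are not mixed with existing %HH escapes.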
      • urlOrFile

        public static URL urlOrFile​(String path,
                                    boolean checkAbsolute,
                                    boolean allowDir,
                                    URL baseURL)
        Returns a URL created from specified path. First, this convenience function attempts to convert specified path to a URL. If this fails, specified path is considered to be the name of an existing file or directory. If this filename conforms to specified requirements (checkAbsolute, allowDir), it is converted to a URL using FileUtil.fileToURL(java.io.File).
        Parameters:
        path - external form of a URL or the filename of an existing file or directory.

        If the path contains newline characters, everything after the first newline character, including this character, is ignored.

        The reason for this is that Web browsers such as Firefox seem to append the title of the Web page after its URL.

        checkAbsolute - if true, when path is a filename, path must be absolute or this function will return null
        allowDir - if true, when path is a filename, path is allowed to be not only the path of a file but also the path of a directory
        baseURL - base URL used to resolve path when it is a relative URL. May be null.
        Returns:
        a URL or null if specified path cannot be converted to a URL given specified requirements
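
        The decision sequence described above can be sketched with the JDK alone. UrlOrFileSketch is a hypothetical name; the sketch omits the baseURL parameter, uses File.toURI().toURL() in place of FileUtil.fileToURL, and treats both '\n' and '\r' as newline characters, all of which are assumptions made for this example.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical helper, not part of URLUtil. */
final class UrlOrFileSketch {
    @SuppressWarnings("deprecation") // URL constructors are deprecated since Java 20
    static URL urlOrFile(String path, boolean checkAbsolute, boolean allowDir) {
        // Ignore everything from the first newline character onwards.
        int cut = path.length();
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            if (c == '\n' || c == '\r') {
                cut = i;
                break;
            }
        }
        String spec = path.substring(0, cut);

        // First attempt: the string is the external form of a URL.
        try {
            return new URL(spec);
        } catch (MalformedURLException notAUrl) {
            // Fall through: treat spec as a filename.
        }

        // Second attempt: the string names an existing file or directory.
        File file = new File(spec);
        if (!file.exists()
            || (checkAbsolute && !file.isAbsolute())
            || (!allowDir && file.isDirectory())) {
            return null;
        }
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException cannotConvert) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The page title pasted after the URL is dropped.
        System.out.println(urlOrFile("http://www.acme.com/index.html\nAcme home page", true, false));
    }
}
```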
      • sameRoot

        public static boolean sameRoot​(URL url1,
                                       URL url2)
        Returns true if specified URLs have the same root.
        See Also:
        getRoot(java.net.URL)
      • getRoot

        public static URL getRoot​(URL url)
        Returns the root of specified URL.

        Example: returns "http://java.sun.com/" for "http://java.sun.com/docs/index.html".

        Example: returns "jar:http://www.foo.com/bar/baz.jar!/" for "jar:http://www.foo.com/bar/baz.jar!/COM/foo/Quux.class".

        Parameters:
        url - a hierarchical or "jar:" URL
        Returns:
        root of specified URL
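
        The two examples above can be reproduced with a few lines of JDK code. The sketch below is only an approximation under stated assumptions: RootSketch is a hypothetical name, user info is dropped from the root, and roots are compared as strings.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical helper, not part of URLUtil. */
final class RootSketch {
    /** Returns the root of a hierarchical or "jar:" URL, or null for a "jar:" URL without "!/". */
    @SuppressWarnings("deprecation") // URL constructors are deprecated since Java 20
    static URL rootOf(URL url) throws MalformedURLException {
        if ("jar".equals(url.getProtocol())) {
            // The root of a "jar:" URL ends right after the "!/" separator.
            String spec = url.toExternalForm();
            int separator = spec.indexOf("!/");
            return (separator < 0) ? null : new URL(spec.substring(0, separator + 2));
        }
        // Hierarchical URL: same protocol, host and port, with "/" as the path.
        // Assumption: user info, if any, is not kept in the root.
        return new URL(url.getProtocol(), url.getHost(), url.getPort(), "/");
    }

    /** Compares roots as strings; URL.equals is avoided because it may resolve host names. */
    static boolean sameRoot(URL url1, URL url2) throws MalformedURLException {
        URL root1 = rootOf(url1);
        URL root2 = rootOf(url2);
        return root1 != null && root2 != null
            && root1.toExternalForm().equals(root2.toExternalForm());
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws MalformedURLException {
        System.out.println(rootOf(new URL("http://java.sun.com/docs/index.html")));
        System.out.println(rootOf(new URL("jar:http://www.foo.com/bar/baz.jar!/COM/foo/Quux.class")));
    }
}
```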
      • isAncestorOf

        public static boolean isAncestorOf​(URL ancestorURL,
                                           URL url)
        Tests whether first specified URL is an ancestor of second specified URL.
        Parameters:
        ancestorURL - a hierarchical or "jar:" URL
        url - a hierarchical or "jar:" URL
        Returns:
        true if ancestorURL is equal to or is an ancestor directory of url; false otherwise.
      • getParent

        public static URL getParent​(URL url)
        Returns the parent of specified URL, if any. Returned URL has a path which ends with '/'.

        Examples:

        • Returns "http://java.sun.com/docs/" for "http://java.sun.com/docs/index.html".
        • Returns null for "http://java.sun.com/".
        • Returns "jar:http://www.foo.com/bar/baz.jar!/COM/foo/" for "jar:http://www.foo.com/bar/baz.jar!/COM/foo/Quux.class".
        Parameters:
        url - a hierarchical or "jar:" URL
        Returns:
        parent of specified URL or null for root URLs.
        See Also:
        URIComponent.getRawParentPath(String, boolean)
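
        The parent computation boils down to a string operation on the raw path. The sketch below shows one possible way to do it; ParentSketch and its methods are hypothetical names, and the URL-level method handles hierarchical URLs only, dropping query, fragment and user info.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical helper, not part of URLUtil. */
final class ParentSketch {
    /** Returns the parent of a raw absolute path, ending with '/', or null for "/" and empty paths. */
    static String rawParentPath(String path) {
        if (path == null || path.isEmpty() || path.equals("/")) {
            return null;
        }
        // Ignore one trailing '/' so that "/icons/png/" is treated like "/icons/png".
        int end = path.endsWith("/") ? path.length() - 1 : path.length();
        int slash = path.lastIndexOf('/', end - 1);
        return (slash < 0) ? null : path.substring(0, slash + 1);
    }

    /** Hierarchical URLs only; query, fragment and user info are dropped in this sketch. */
    @SuppressWarnings("deprecation") // URL constructors are deprecated since Java 20
    static URL parentOf(URL url) throws MalformedURLException {
        String parentPath = rawParentPath(url.getPath());
        return (parentPath == null)
            ? null
            : new URL(url.getProtocol(), url.getHost(), url.getPort(), parentPath);
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) throws MalformedURLException {
        System.out.println(parentOf(new URL("http://java.sun.com/docs/index.html"))); // http://java.sun.com/docs/
        System.out.println(parentOf(new URL("http://java.sun.com/")));                // null
        System.out.println(rawParentPath("/COM/foo/Quux.class"));                     // /COM/foo/
    }
}
```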
      • getRawPath

        public static String getRawPath​(URL url)
        Returns the raw (that is, possibly containing %HH escapes) path, if specified URL has a path.

        Example: returns "/index.html" for "http://www.acme.com/index.html".

        Example: returns "/COM/foo/Quux.class" for "jar:http://www.foo.com/bar/baz.jar!/COM/foo/Quux.class".

        Parameters:
        url - a hierarchical or "jar:" URL
        Returns:
        the raw path or null if specified URL has no path (this is not consistent with URL.getPath which returns the empty string in such a case)
      • getRawBaseName

        public static String getRawBaseName​(URL url)
        Returns the raw (that is, possibly containing %HH escapes) basename part of the path, if specified URL has a path.

        Example: returns "index.html" for "http://www.acme.com/index.html".

        Example: returns "png" for "http://www.acme.com/icons/png/".

        Example: returns "Quux.class" for "jar:http://www.foo.com/bar/baz.jar!/COM/foo/Quux.class".

        Parameters:
        url - a hierarchical or "jar:" URL
        Returns:
        basename or null if specified URL has no path (this is not consistent with URL.getPath which returns the empty string in such a case)
        See Also:
        URIComponent.getRawBaseName(java.lang.String)
      • getRawExtension

        public static String getRawExtension​(URL url)
        Returns the raw (that is, possibly containing %HH escapes) extension of the path, if specified URL has a path. The extension does not include a leading dot '.'.

        Example: returns "html" for "http://www.acme.com/index.html".

        Example: returns null for "http://www.acme.com/icons/png/".

        Example: returns "class" for "jar:http://www.foo.com/bar/baz.jar!/COM/foo/Quux.class".

        Parameters:
        url - a hierarchical or "jar:" URL
        Returns:
        extension or null if specified URL has no path or if its path has no extension (this is not consistent with URL.getPath which returns the empty string in such a case)
        See Also:
        URIComponent.getRawExtension(java.lang.String)
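
        The basename and extension examples given for getRawBaseName and getRawExtension can be reproduced by plain string operations on the raw path. The sketch below is one possible reading; BaseNameSketch is a hypothetical name, and the handling of "/" and of names starting with a dot is an assumption not stated in this documentation.

```java
/** Hypothetical helper, not part of URLUtil. */
final class BaseNameSketch {
    /** Returns the last segment of a raw path, ignoring trailing '/' characters. */
    static String rawBaseName(String path) {
        if (path == null || path.isEmpty()) {
            return null;
        }
        int end = path.length();
        while (end > 0 && path.charAt(end - 1) == '/') {
            end--; // skip trailing slashes, so "/icons/png/" yields "png"
        }
        if (end == 0) {
            return null; // assumption: "/" has no basename
        }
        int slash = path.lastIndexOf('/', end - 1);
        return path.substring(slash + 1, end);
    }

    /** Returns the extension without its leading dot, or null if there is none. */
    static String rawExtension(String path) {
        if (path == null || path.endsWith("/")) {
            return null; // directory-like paths have no extension
        }
        String baseName = rawBaseName(path);
        if (baseName == null) {
            return null;
        }
        int dot = baseName.lastIndexOf('.');
        // Assumption: a leading dot (".profile") or a trailing dot is not an extension.
        if (dot <= 0 || dot == baseName.length() - 1) {
            return null;
        }
        return baseName.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(rawBaseName("/index.html"));          // index.html
        System.out.println(rawBaseName("/icons/png/"));          // png
        System.out.println(rawExtension("/COM/foo/Quux.class")); // class
        System.out.println(rawExtension("/icons/png/"));         // null
    }
}
```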
      • setRawExtension

        public static URL setRawExtension​(URL url,
                                          String extension)
        Changes the extension of specified URL to specified extension.
        Parameters:
        url - a hierarchical or "jar:" URL
        extension - new extension. Assumed to have been quoted using URIComponent.quotePath(java.lang.String). May be null which means: remove the extension.
        Returns:
        a URL identical to url except that its extension has been changed or removed.

        Returns same URL if specified URL has no path or its path ends with '/'.

        See Also:
        URIComponent.setRawExtension(java.lang.String, java.lang.String)
      • getRawUserName

        public static String getRawUserName​(URL url)
        Returns the raw (that is, possibly containing %HH escapes) user name, if a user info is found in specified URL. Returns null otherwise.
      • getRawUserPassword

        public static String getRawUserPassword​(URL url)
        Returns the raw (that is, possibly containing %HH escapes) user password, if a user info is found in specified URL. Returns null otherwise.
      • setRawUserInfo

        public static URL setRawUserInfo​(URL url,
                                         String userName,
                                         String password)
        Changes the user info of specified URL to specified user info.
        Parameters:
        url - a hierarchical or "jar:" URL
        userName - new username. Assumed to have been quoted using URIComponent.quoteUserInfo(java.lang.String). May be null, which means: remove user info.
        password - new password. Assumed to have been quoted using URIComponent.quoteUserInfo(java.lang.String). May be null, which means: password not specified.
        Returns:
        a URL identical to url except that its user info has been changed or removed.
      • getRawRelativePath

        public static String getRawRelativePath​(URL url,
                                                URL base)
        Returns the path of specified URL relative to specified base URL.

        More precisely, returns relativePath such that new URL(base, relativePath) equals url.

        Parameters:
        url - a hierarchical or "jar:" URL
        base - another hierarchical or "jar:" URL
        Returns:
        a relative path, possibly followed by the query and fragment components of url; or the result of URL.toExternalForm if url or base has no path or if url and base do not have the same root
        See Also:
        URIComponent.getRawRelativePath(java.lang.String, java.lang.String)
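
        To make the contract "new URL(base, relativePath) equals url" concrete, here is a minimal sketch that computes a relative path between two absolute raw paths and checks the round trip. RelativePathSketch is a hypothetical name, the example URLs are made up, and query, fragment and the different-root fallback are deliberately left out.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical helper, not part of URLUtil. */
final class RelativePathSketch {
    /** Both arguments are absolute raw paths starting with '/'. */
    static String rawRelativePath(String path, String basePath) {
        // Relative references are resolved against the directory of the base.
        String baseDir = basePath.substring(0, basePath.lastIndexOf('/') + 1);

        // Length of the longest common prefix that ends with '/'.
        int common = 0;
        int max = Math.min(path.length(), baseDir.length());
        for (int i = 0; i < max && path.charAt(i) == baseDir.charAt(i); i++) {
            if (path.charAt(i) == '/') {
                common = i + 1;
            }
        }

        // One "../" for each remaining directory of the base, then the rest of the path.
        StringBuilder relative = new StringBuilder();
        for (int i = common; i < baseDir.length(); i++) {
            if (baseDir.charAt(i) == '/') {
                relative.append("../");
            }
        }
        relative.append(path.substring(common));
        return (relative.length() == 0) ? "./" : relative.toString();
    }

    @SuppressWarnings("deprecation") // URL constructors are deprecated since Java 20
    public static void main(String[] args) throws MalformedURLException {
        URL base = new URL("http://www.acme.com/docs/guide/index.html");
        URL url = new URL("http://www.acme.com/docs/img/logo.png");
        String relative = rawRelativePath(url.getPath(), base.getPath());
        System.out.println(relative);                // ../img/logo.png
        System.out.println(new URL(base, relative)); // http://www.acme.com/docs/img/logo.png
    }
}
```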
      • toDisplayForm

        public static String toDisplayForm​(URL url)
        Same as java.net.URL.toExternalForm except that returned string may contain non-ASCII characters and that, if specified URL contains a password, the characters of this password are replaced by '*'.

        Example: returns ftp://jjc%40xx.com:******@ftp.xx.com/pub/My%20report.doc for ftp://jjc%40xx.com:s%25same@ftp.xx.com/pub/My%20report.doc.

        Parameters:
        url - a hierarchical URL possibly having a fragment and a query string
        Returns:
        display form or URL.toExternalForm if specified URL is opaque ("jar:" URLs are opaque).
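
        The password masking shown in the example can be sketched as follows. DisplayFormSketch is a hypothetical name; the sketch only covers the masking step (one '*' per decoded password character, which matches the six asterisks of the example above) and leaves all %HH escapes untouched, so it does not reproduce the non-ASCII part of toDisplayForm.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLDecoder;

/** Hypothetical helper, not part of URLUtil. */
final class DisplayFormSketch {
    /** Returns the external form of url with the password, if any, replaced by '*' characters. */
    static String maskPassword(URL url) throws UnsupportedEncodingException {
        String external = url.toExternalForm();
        String userInfo = url.getUserInfo(); // raw form, e.g. "jjc%40xx.com:s%25same"
        if (userInfo == null) {
            return external;
        }
        int colon = userInfo.indexOf(':');
        int start = external.indexOf(userInfo + "@");
        if (colon < 0 || start < 0) {
            return external; // no password to hide
        }

        // Decode %HH escapes to count real characters; protect '+', which URLDecoder maps to a space.
        String rawPassword = userInfo.substring(colon + 1);
        String password = URLDecoder.decode(rawPassword.replace("+", "%2B"), "UTF-8");
        StringBuilder masked = new StringBuilder(userInfo.substring(0, colon + 1));
        for (int i = 0; i < password.length(); i++) {
            masked.append('*');
        }
        return external.substring(0, start) + masked + external.substring(start + userInfo.length());
    }

    @SuppressWarnings("deprecation") // URL constructors are deprecated since Java 20
    public static void main(String[] args) throws MalformedURLException, UnsupportedEncodingException {
        URL url = new URL("ftp://jjc%40xx.com:s%25same@ftp.xx.com/pub/My%20report.doc");
        System.out.println(maskPassword(url)); // ftp://jjc%40xx.com:******@ftp.xx.com/pub/My%20report.doc
    }
}
```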
      • toShortLabel

        public static String toShortLabel​(URL url,
                                          int maxLength)
        Same as toLabel(java.net.URL) except that the returned string is made shorter than specified length (when possible). This function is useful to display the recently opened URLs in the File menu of an application.
      • toShortDisplayForm

        public static String toShortDisplayForm​(URL url,
                                                int maxLength)
        Same as toDisplayForm(java.net.URL) except that the returned string is made shorter than specified length (when possible). This function is useful to display the recently opened URLs in the File menu of an application.

        Supports "jar:" URLs, but returns other opaque URLs as is.

      • exists

        public static boolean exists​(URL url,
                                     boolean followRedirects,
                                     int timeout)
                              throws IOException
        Tests whether specified URL corresponds to an existing resource.

        This method treats "file:" URLs as a special, optimized case.

        Parameters:
        url - the URL to be tested
        followRedirects - if true, follow redirections ("301: Moved Permanently", "302: Temporary Redirect"), including the very common http to https ones.
        timeout - specifies both connect and read timeout values in milliseconds. 0 means: infinite timeout. A negative value means: default value.
        Returns:
        true if specified URL corresponds to an existing resource; returns false otherwise.
        Throws:
        IOException - if there is an I/O problem
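
        For readers who want to see what such a test may involve, here is a minimal JDK-only sketch: a direct file system check for "file:" URLs and an HTTP HEAD request otherwise. ExistsSketch is a hypothetical name and the strategy is an assumption, not the actual implementation of exists. Note that HttpURLConnection does not follow redirections from http to https by itself, so this sketch does not cover that case.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLConnection;

/** Hypothetical helper, not part of URLUtil. */
final class ExistsSketch {
    static boolean exists(URL url, boolean followRedirects, int timeout) throws IOException {
        // Special case: "file:" URLs are checked directly on the file system.
        if ("file".equals(url.getProtocol())) {
            try {
                return new File(url.toURI()).exists();
            } catch (URISyntaxException | IllegalArgumentException e) {
                throw new IOException("cannot convert to a file: " + url, e);
            }
        }

        URLConnection connection = url.openConnection();
        if (timeout >= 0) { // a negative value keeps the default timeouts
            connection.setConnectTimeout(timeout);
            connection.setReadTimeout(timeout);
        }

        if (connection instanceof HttpURLConnection) {
            HttpURLConnection http = (HttpURLConnection) connection;
            http.setRequestMethod("HEAD"); // headers only, no body transfer
            http.setInstanceFollowRedirects(followRedirects);
            try {
                return http.getResponseCode() == HttpURLConnection.HTTP_OK;
            } finally {
                http.disconnect();
            }
        }

        // Other protocols (e.g. "jar:"): try to open the resource.
        try (InputStream in = connection.getInputStream()) {
            return true;
        } catch (FileNotFoundException missing) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("exists-sketch", ".tmp");
        URL url = temp.toURI().toURL();
        System.out.println(exists(url, true, 1000)); // true
        temp.delete();
        System.out.println(exists(url, true, 1000)); // false
    }
}
```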
      • lastModified

        public static long lastModified​(URL url,
                                        boolean followRedirects,
                                        int timeout)
                                 throws IOException
        Returns the date of the resource having specified URL.

        This method treats "file:" URLs as a special, optimized case.

        Parameters:
        url - the URL to be tested
        followRedirects - if true, follow redirections ("301: Moved Permanently", "302: Temporary Redirect"), including the very common http to https ones.
        timeout - specifies both connect and read timeout values in milliseconds. 0 means: infinite timeout. A negative value means: default value.
        Returns:
        A number of milliseconds since January 1, 1970 GMT. If specified URL does not exist or if this date is unknown, a number that is negative or zero.
        Throws:
        IOException - if there is an I/O problem
      • checkHttpConnection

        public static URLConnection checkHttpConnection​(URLConnection connection,
                                                        boolean followRedirects)
                                                 throws IOException
        Establishes specified HTTP connection and checks whether it returns a "200 OK".

        No effect when specified connection is not an HttpURLConnection (e.g. a JarURLConnection).

        Parameters:
        connection - HTTP connection to be checked
        followRedirects - if true, follow redirections ("301: Moved Permanently", "302: Temporary Redirect"), including the very common http to https ones.
        Throws:
        IOException - when specified HTTP connection returned a code other than "200 OK". The message of the exception contains information about what happened.
      • loadBytes

        public static byte[] loadBytes​(URL url,
                                       boolean followRedirects,
                                       int timeout)
                                throws IOException
        Loads the content of a URL containing binary data.
        Parameters:
        url - the URL of the binary data
        followRedirects - if true, follow redirections ("301: Moved Permanently", "302: Temporary Redirect"), including the very common http to https ones.
        timeout - specifies both connect and read timeout values in milliseconds. 0 means: infinite timeout. A negative value means: default value.
        Returns:
        the loaded bytes
        Throws:
        IOException - if there is an I/O problem
      • copyFile

        public static int copyFile​(String srcLocation,
                                   String dstLocation)
                            throws IllegalArgumentException,
                                   IOException
        Copies the contents of specified URL to specified "file:" URL.
        Parameters:
        srcLocation - URL of the source file in string form. If relative, this location is relative to the current working directory.
        dstLocation - URL of the destination file in string form. If relative, this location is relative to the current working directory.
        Throws:
        IllegalArgumentException - if srcLocation cannot be parsed as a URL or if dstLocation cannot be parsed as a "file:" URL.
        IOException - if an I/O problem occurs
      • loadString

        public static String loadString​(URL url,
                                        String charset,
                                        boolean followRedirects,
                                        int timeout)
                                 throws IOException
        Loads the content of a URL containing text.
        Parameters:
        url - the URL of the text resource
        charset - the IANA charset of the text source if known; specifying null means detect it using the content type obtained from the connection
        followRedirects - if true, follow redirections ("301: Moved Permanently", "302: Temporary Redirect"), including the very common http to https ones.
        timeout - specifies both connect and read timeout values in milliseconds. 0 means: infinite timeout. A negative value means: default value.
        Returns:
        the loaded String
        Throws:
        IOException - if there is an I/O problem
      • contentTypeToCharset

        public static String contentTypeToCharset​(String contentType)
        Returns the value of the charset parameter possibly found in specified content type. For example, returns "utf-8" when passed "text/html; charset=UTF-8".
        Parameters:
        contentType - a content type (AKA media type) possibly having a charset parameter
        Returns:
        value of the charset parameter (lower case) if any; null otherwise
      • contentTypeToMedia

        public static String contentTypeToMedia​(String contentType)
        Parses a content type such as "text/html; charset=ISO-8859-1" and returns the media type (for the above example "text/html").
        Parameters:
        contentType - the content type to be parsed
        Returns:
        the media type (lower case) if parsing was successful; null otherwise.
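
        The behavior documented for contentTypeToCharset and contentTypeToMedia can be approximated with simple string parsing. ContentTypeSketch is a hypothetical name, and splitting on ';' is a simplification: a quoted parameter value containing ';' would not be handled correctly.

```java
import java.util.Locale;

/** Hypothetical helper, not part of URLUtil. */
final class ContentTypeSketch {
    /** Returns the lower-case media type ("type/subtype"), or null if it cannot be parsed. */
    static String mediaOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semicolon = contentType.indexOf(';');
        String media = (semicolon < 0 ? contentType : contentType.substring(0, semicolon)).trim();
        int slash = media.indexOf('/');
        if (slash <= 0 || slash == media.length() - 1) {
            return null; // not of the form type/subtype
        }
        return media.toLowerCase(Locale.ROOT);
    }

    /** Returns the lower-case value of the charset parameter, or null if there is none. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the media type
            String parameter = parts[i].trim();
            int equals = parameter.indexOf('=');
            if (equals > 0 && parameter.substring(0, equals).trim().equalsIgnoreCase("charset")) {
                String value = parameter.substring(equals + 1).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1); // strip surrounding quotes
                }
                return value.isEmpty() ? null : value.toLowerCase(Locale.ROOT);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8")); // utf-8
        System.out.println(mediaOf("text/html; charset=ISO-8859-1")); // text/html
    }
}
```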
      • normalizeContentType

        public static String normalizeContentType​(String contentType,
                                                  String defaultCharset)
        Returns a normalized string form for specified content type.

        Example: returns text/html;charset=iso-8859-1 for text/html; charset="ISO-8859-1".

        Parameters:
        contentType - content type to be normalized
        defaultCharset - charset to add as a parameter to the content type when this parameter is absent. May be null.
        Returns:
        normalized string form for specified content type; null if specified content type is malformed.
        See Also:
        sameContentType(java.lang.String, java.lang.String, java.lang.String)
      • sameContentType

        public static boolean sameContentType​(String ct1,
                                              String ct2,
                                              String defaultCharset)
        Tests whether specified content types are identical.

        Examples:

        • Returns true for text/html; charset=ISO-8859-1 and text/html;charset="iso-8859-1".
        • Returns false for text/html; charset=ISO-8859-1 and text/html.
        Parameters:
        ct1 - content type to be tested
        ct2 - content type to be tested
        defaultCharset - charset to add as a parameter to a content type when this parameter is absent. May be null.
        Returns:
        true if specified content types are identical; false otherwise
        See Also:
        normalizeContentType(java.lang.String, java.lang.String)
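
        Comparing content types through a normalized form, as described for normalizeContentType and sameContentType, could look like the sketch below. NormalizeContentTypeSketch is a hypothetical name; dropping every parameter other than charset is an assumption of this sketch, not documented behavior.

```java
import java.util.Locale;

/** Hypothetical helper, not part of URLUtil. */
final class NormalizeContentTypeSketch {
    /** Returns "type/subtype" or "type/subtype;charset=name" in lower case, or null if malformed. */
    static String normalize(String contentType, String defaultCharset) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        if (parts.length == 0) {
            return null; // the string contained only ';' characters
        }
        String media = parts[0].trim().toLowerCase(Locale.ROOT);
        if (!media.matches("[^/\\s]+/[^/\\s]+")) {
            return null; // not of the form type/subtype
        }

        // Use the explicit charset parameter if present, else the default one.
        String charset = defaultCharset;
        for (int i = 1; i < parts.length; i++) {
            String[] nameValue = parts[i].split("=", 2);
            if (nameValue.length == 2 && nameValue[0].trim().equalsIgnoreCase("charset")) {
                charset = nameValue[1].trim().replace("\"", "");
            }
        }
        if (charset == null || charset.isEmpty()) {
            return media;
        }
        return media + ";charset=" + charset.toLowerCase(Locale.ROOT);
    }

    /** Two content types are considered identical when their normalized forms are equal. */
    static boolean same(String ct1, String ct2, String defaultCharset) {
        String normalized1 = normalize(ct1, defaultCharset);
        return normalized1 != null && normalized1.equals(normalize(ct2, defaultCharset));
    }

    public static void main(String[] args) {
        System.out.println(normalize("text/html; charset=\"ISO-8859-1\"", null)); // text/html;charset=iso-8859-1
        System.out.println(same("text/html; charset=ISO-8859-1", "text/html", null)); // false
    }
}
```

        With a non-null defaultCharset, a content type lacking a charset parameter compares equal to one that specifies that same charset explicitly.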
      • openConnectionNoCache

        public static URLConnection openConnectionNoCache​(URL url)
                                                   throws IOException
        Similar to url.openConnection except that the accessed resource is not allowed to be a cached copy.
        Parameters:
        url - URL for which a URLConnection must be opened
        Throws:
        IOException - if URLConnection cannot be opened
        See Also:
        openStreamNoCache(URL)
      • openStreamNoCache

        public static InputStream openStreamNoCache​(URL url)
                                             throws IOException
        Similar to url.openStream except that the accessed resource is not allowed to be a cached copy.
        Parameters:
        url - URL for which an input stream must be opened
        Returns:
        opened input stream
        Throws:
        IOException - if the input stream cannot be opened
        See Also:
        openConnectionNoCache(URL)
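
        With the JDK alone, bypassing caches is usually requested through URLConnection.setUseCaches. The sketch below shows that approach; NoCacheSketch is a hypothetical name, and whether openConnectionNoCache and openStreamNoCache are actually implemented this way is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

/** Hypothetical helper, not part of URLUtil. */
final class NoCacheSketch {
    /** Opens a connection that asks the protocol handler not to use cached copies. */
    static URLConnection openConnectionNoCache(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setUseCaches(false); // must be called before connecting
        return connection;
    }

    /** Opens an input stream on a connection created with caching disabled. */
    static InputStream openStreamNoCache(URL url) throws IOException {
        return openConnectionNoCache(url).getInputStream();
    }

    public static void main(String[] args) throws IOException {
        File temp = File.createTempFile("no-cache-sketch", ".tmp");
        try (InputStream in = openStreamNoCache(temp.toURI().toURL())) {
            System.out.println(in.read()); // -1: the temporary file is empty
        } finally {
            temp.delete();
        }
    }
}
```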
      • openConnectionUseCache

        public static URLConnection openConnectionUseCache​(URL url)
                                                    throws IOException
        Similar to url.openConnection except that the accessed resource may be a cached copy.
        Parameters:
        url - URL for which a URLConnection must be opened
        Throws:
        IOException - if URLConnection cannot be opened
        See Also:
        openStreamUseCache(URL)
      • openStreamUseCache

        public static InputStream openStreamUseCache​(URL url)
                                              throws IOException
        Similar to url.openStream except that the accessed resource may be a cached copy.
        Parameters:
        url - URL for which an input stream must be opened
        Returns:
        opened input stream
        Throws:
        IOException - if the input stream cannot be opened
        See Also:
        openConnectionUseCache(URL)