
Saturday October 27, 2007

Scripting for Something in HTML Files

I wrote a script in Groovy that pulls all lines containing an HREF attribute from every HTML file in a folder, as well as from the HTML files in its immediate subfolders. I call this script from a Java class, as explained in a previous blog entry. The result is that the Output window is populated with all lines containing an HREF attribute.

Here's the script. It could probably be a lot better, especially the part where the next level of folders is found:

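package demojavaapplication

class HelloWorld {

    // Folder to scan; its immediate subfolders are scanned as well
    def basedir = '/home/geertjan/ijc/htmlfiles'

    // Matching lines collected for the file currently being processed
    def text = []

    // Instance method (not a static entry point); called from the Java class
    void main(args) {
        new File(basedir).eachFile { f ->
            if (f.isFile() && f.toString().endsWith("html")) {
                writeTags(f)
            } else if (!f.isFile()) {
                // One level down: iterate the subfolder directly rather
                // than reassigning basedir
                f.eachFile { fNext ->
                    if (fNext.isFile() && fNext.toString().endsWith("html")) {
                        writeTags(fNext)
                    }
                }
            }
        }
    }

    // Prints every line of the file that contains 'href' (case-sensitive)
    void writeTags(f) {
        println "--------------"
        println "File: " + f.getName()
        f.eachLine { ln ->
            if (ln =~ 'href') {
                text << "${ln}"
            }
        }
        text.each { println "   Found: $it" }
        text.clear()
    }

}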

If someone can help to make this script more compact, I would be happy to hear about it.
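
One possible direction, as a rough sketch rather than a finished replacement: put the base folder and its immediate subfolders into one list, then let findAll do the filtering of files and lines. The method name findHrefLines is made up for illustration, and the sketch assumes a reasonably recent Groovy (2.x or later) so that collectMany and collectEntries are available. Unlike the script above, it lists the files in the base folder first and the files in the subfolders afterwards.

```groovy
// Hypothetical helper (name is illustrative): maps each HTML file path to the
// lines in that file containing 'href' (case-sensitive, like the original).
// Looks in baseDir and its immediate subfolders only.
Map<String, List<String>> findHrefLines(File baseDir) {
    def isHtml = { File f -> f.isFile() && f.name.endsWith('html') }

    // The base folder itself plus each immediate subfolder
    // (listFiles() returns null for a missing folder, hence the ?: [])
    def dirs = [baseDir] + (baseDir.listFiles() ?: []).findAll { it.isDirectory() }

    // All HTML files found directly inside those folders
    def htmlFiles = dirs.collectMany { dir -> (dir.listFiles() ?: []).findAll(isHtml) }

    // One map entry per file: path -> matching lines (possibly empty)
    htmlFiles.collectEntries { f ->
        [(f.path): f.readLines().findAll { it.contains('href') }]
    }
}

// Same output layout as the original script
findHrefLines(new File('/home/geertjan/ijc/htmlfiles')).each { path, lines ->
    println '--------------'
    println "File: ${new File(path).name}"
    lines.each { println "   Found: $it" }
}
```

Returning a map instead of printing inside the helper keeps the searching separate from the output, so the calling Java class could also use the result directly.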

Oct 27 2007, 01:10:04 PM PDT