Saturday October 27, 2007
Scripting for Something in HTML Files
This Groovy script pulls all lines with HREF attributes from all HTML files within a folder, as well as from the HTML files in each of its immediate subfolders. I call the script from a Java class, as explained in a previous blog entry. The result is that the Output window is populated with all lines containing an HREF attribute.
Here's the script. It could probably be a lot better, especially the part where the next level is found:
package demojavaapplication

class HelloWorld {

    def basedir = '/home/geertjan/ijc/htmlfiles'
    def text = []

    // Entry point, called from a Java class
    void main(args) {
        new File(basedir).eachFile { f ->
            if (f.isFile() && f.toString().endsWith("html")) {
                writeTags(f)
            } else if (!f.isFile()) {
                // Next level: scan the HTML files in each immediate subfolder
                f.eachFile { fNext ->
                    if (fNext.isFile() && fNext.toString().endsWith("html")) {
                        writeTags(fNext)
                    }
                }
            }
        }
    }

    // Prints every line of the given file that contains 'href' (case-sensitive)
    void writeTags(f) {
        println "--------------"
        println "File: " + f.getName()
        f.eachLine { ln ->
            if (ln =~ 'href') {
                text << "${ln}"
            }
        }
        text.each { println " Found: $it" }
        text.clear()
    }
}

If someone can help make this script more compact, I would be happy to hear about it.
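One possible way to make it more compact, offered only as a sketch: collect the base folder and its immediate subfolders into one list, then filter files and lines with findAll instead of nested if blocks. The helper name findHrefLines is hypothetical and not part of the original script, and the sketch assumes a Groovy version that provides collectMany and collectEntries. It also lists top-level files before those in subfolders, which may differ from the order the original script prints.

```groovy
// Sketch only: findHrefLines is a hypothetical helper, not from the original script.
// Maps each HTML file in basedir and its immediate subfolders to its 'href' lines.
Map<File, List<String>> findHrefLines(File basedir) {
    // listFiles() returns null for a missing or unreadable folder, so fall back to [].
    def children = { File d -> (d.listFiles() ?: []) as List }
    def isHtml = { File f -> f.isFile() && f.name.endsWith('html') }

    // The base folder itself plus one level of subfolders.
    def dirs = [basedir] + children(basedir).findAll { it.isDirectory() }
    def htmlFiles = dirs.collectMany { dir -> children(dir).findAll(isHtml) }

    // Same case-sensitive 'href' match as the original script.
    htmlFiles.collectEntries { f -> [(f): f.readLines().findAll { it =~ 'href' }] }
}

// Usage: print the results in the same layout as the original script.
findHrefLines(new File('/home/geertjan/ijc/htmlfiles')).each { f, lines ->
    println "--------------"
    println "File: ${f.name}"
    lines.each { println " Found: $it" }
}
```

Returning a map rather than printing inside the helper keeps the searching separate from the output, so the mutable text list from the original class is no longer needed.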
Oct 27 2007, 01:10:04 PM PDT


