A web-based application to semi-automate site map creation

I started working with GraphViz this month and have created a web-based application that converts tab-delimited text files into diagrams. The sole purpose of the application at this point is to turn site inventories or IA hierarchies into clickable site maps like this.

Before you ask why I bothered to do this, I'll give a little history. Immediately after writing the article Automating Diagrams with Visio for Boxes and Arrows, I began to see that I didn't want to draw circles, boxes, lines, etc. anymore. That hacky process I used served its purpose. But over the past year I have learned to let databases and scripting languages do the heavy work we normally do in applications like Excel, e.g. content inventories and site architecture (capturing page/node data and parent-child relationships). I still need to work with Excel or plain text files for some of the smaller sites I work on outside of my day job, though. So I still do the site architecture in Excel, and now I can do the diagramming in GraphViz.

So try out the app and let me know if you are doing anything similar or see other uses for this thing.

UPDATE: Added a few options including hierarchical or radial layout, box or circle shapes, fill or no fill, and shape and font coloring options so you can now create diagrams like this.
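For readers curious about what such a conversion might look like, here is a minimal Python sketch. It is not the application's actual code. The column layout (id, label, parent id, URL) and the function names `tsv_to_dot` and `quote` are assumptions made for illustration. The DOT attributes it emits (`shape`, `style=filled`, `fillcolor`, `fontcolor`, `URL`) are standard GraphViz attributes.

```python
import csv
import io


def quote(value):
    """Wrap a string in double quotes for DOT, escaping embedded quotes."""
    return '"' + value.replace('"', '\\"') + '"'


def tsv_to_dot(tsv_text, shape="box", filled=True,
               fillcolor="lightgrey", fontcolor="black"):
    """Build DOT source from rows of: id, label, parent id, URL (assumed columns)."""
    node_attrs = [f"shape={shape}", f"fontcolor={quote(fontcolor)}"]
    if filled:
        node_attrs += ["style=filled", f"fillcolor={quote(fillcolor)}"]
    lines = ["digraph sitemap {", "  node [" + ", ".join(node_attrs) + "];"]
    edges = []
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        row += [""] * (4 - len(row))  # pad the optional parent and URL columns
        node_id, label, parent_id, url = (cell.strip() for cell in row[:4])
        attrs = [f"label={quote(label)}"]
        if url:
            attrs.append(f"URL={quote(url)}")  # clickable in SVG or image-map output
        lines.append(f"  {quote(node_id)} [" + ", ".join(attrs) + "];")
        if parent_id:
            edges.append(f"  {quote(parent_id)} -> {quote(node_id)};")
    return "\n".join(lines + edges + ["}"])


if __name__ == "__main__":
    sample = "1\tHome\t\thttp://example.com/\n1.1\tAbout\t1\t\n"
    print(tsv_to_dot(sample, shape="circle", filled=False))
```

Saving the output to a file and running `dot -Tsvg` on it gives a hierarchical layout, while `twopi -Tsvg` lays the same file out radially.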


Boxes and arrows

Brilliant, Michael! No need to explain your motivation: I hate drawing circles, lines, boxes, and arrows, too. Absolutely hate it. The Visio vs. Illustrator debates seem silly to me - neither is efficient. I felt like I was spending the whole day pushing boxes and arrows around on the screen when using such tools.


Note that you could link pages in your sitemap tool to HTML wireframes, so that during development you have a coordination between the two key IA deliverables.

Cool deal.

Great idea, James. Thanks for

Great idea, James. Thanks for the feedback. I suppose for content inventories the boxes can link to the URLs of existing pages, and for sitemaps of pages that are being developed, they can link to wireframes.

Content Inventory

Great tool! A map seems to fit with many other tools like a Lego block; I could see how a content inventory could be an extension of your tool.

Jeff Veen wrote an article at Adaptive Path concerning the usage and format of content inventories. In the article, he includes an Excel file as a sample format for an inventory. I use the same concepts (Link ID (x.x.x), Name, Link, Type, Owner, ROT, Notes) in an HTML document created in Dreamweaver. I find it easier to create a content inventory in HTML as opposed to Excel for increased portability and ease of changing fonts, cell background color, etc.

Now I regret my decision to use HTML, because I can't easily convert the file's format to tab-delimited for use with the sitemap generator.

quam

Strip HTML in an editor?

Doing inventories in HTML does make for difficult manipulation of data. You could easily run your HTML file through an editor to strip out the HTML and then replace the table cells (assuming you're using a table or something) with tabs. I know this would be easy in BBEdit at least. Not sure how to do it in HomeSite. In vi, I would think you could do this very easily.
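If an editor isn't handy, the same strip-and-replace idea can be scripted. The following is only a sketch in Python using the standard library's `html.parser` module. The names `TableToTabs` and `html_table_to_tsv` are made up for this example, and it assumes the inventory is a single, non-nested HTML table.

```python
from html.parser import HTMLParser


class TableToTabs(HTMLParser):
    """Collect the text of each table cell, one list of cells per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        # Text inside nested tags such as <a> or <b> still lands in the open cell.
        if self.in_cell:
            self.rows[-1][-1] += data


def html_table_to_tsv(html):
    """Return the table as tab-delimited text, one line per row."""
    parser = TableToTabs()
    parser.feed(html)
    parser.close()
    return "\n".join(
        "\t".join(" ".join(cell.split()) for cell in row)  # collapse whitespace
        for row in parser.rows if row
    )
```

Each cell's whitespace is collapsed so that stray tabs or line breaks inside a cell don't break the tab-delimited output.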

HomeSite editing easy

HomeSite version 4.5.2 and higher have very good regular expression scripting capabilities that make removing tags and restructuring well-formatted HTML a breeze.

Source data?

Hi,

This looks nice. I was wondering what you planned to use for source data, to construct the tab-delim files?

I've recently been looking at the same kind of situation in my own app, and although I've not sorted out a good tree layout yet I've got import from filesystem going, so the directory tree is walked to create nodes.

Cheers,
Danny.

-----------
http://dannyayers.com

Source data

I originally started doing this type of conversion of hierarchies to sitemaps using content lists (page inventories?) created in Excel. The content lists were inventories of pages to be created when developing a site. I've since gone to a model where my team maintains page inventories in a database and we use that data with graphviz to create sitemaps. On small projects, I still enter the site hierarchy as a list of pages in Excel.

There are a lot of applications that can be imagined, such as the one you're suggesting of creating the data file by walking your filesystem tree. Another example is to create data files out of Apache referrer logs. We do that to visualize information seeking patterns, similar to the experiments that James Spahr has been doing with OmniGraffle. I'm interested in your approach. Are you trying to visualize the files existing on your filesystem for some reason?
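To give a rough idea of the referrer log case, here is a small Python sketch. It is not the script we run at work. It assumes the logs are in Apache's Combined Log Format, and the names `LOG_LINE` and `referrer_edges` are invented for the example. It counts how often each on-site page referred a visitor to another page, which gives weighted "from, to" pairs that could be turned into a data file.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the quoted request, the status, the size, and the quoted referrer.
LOG_LINE = re.compile(r'"(?:GET|POST|HEAD) (\S+)[^"]*" \d{3} \S+ "([^"]*)"')


def referrer_edges(log_lines, site_host):
    """Count (referring path, requested path) pairs for on-site referrers."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # not a Combined Log Format line
        path, referrer = match.groups()
        ref = urlsplit(referrer)
        if ref.hostname != site_host:
            continue  # ignore "-", search engines and other external referrers
        counts[(ref.path or "/", path)] += 1
    return counts
```

Filtering on `site_host` keeps only movement within the site, so external referrers don't clutter the diagram.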

Source Data

Thanks.
I must admit I've never used inventories of the form you describe (not worked on any big sites either, for that matter) so that's new to me.

My application (Ideagraph) is an information manager, and one of the kinds of information is HTML docs. The main user interface is graphic (nodes & arcs). I've recently been doing quite a lot of work in the HTML/site parts of it, since I lost my Dreamweaver CD and have 3 sites to update...

Anyhow, I want to be able to use data from a lot of different sources, and it looks like I should add Excel site inventories to the list!

Thanks again,
Danny.

-----------
http://dannyayers.com

Semantic Weblog

dc

This tool would be good if
1. it could automatically generate the text file by crawling the web site.
2. if the text file was provided (eg by the user), the tool could find
conflicts between the links in the site and the text file, such as node
B claimed to be a child of node A, but no link between A and B.

dc.
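The check described in point 2 could be sketched as a simple comparison. This is a hypothetical Python illustration. It assumes the text file has already been reduced to (parent, child) pairs and the crawl to (from, to) link pairs, and the name `find_conflicts` is made up.

```python
def find_conflicts(claimed_pairs, site_links):
    """Return claimed (parent, child) pairs that have no matching link on the site."""
    links = set(site_links)
    return [pair for pair in claimed_pairs if pair not in links]


if __name__ == "__main__":
    claimed = [("A", "B"), ("A", "C")]   # from the text file
    links = [("A", "C"), ("C", "A")]     # found by crawling the site
    print(find_conflicts(claimed, links))
```

Here node B is claimed as a child of node A, but no link from A to B exists, so that pair is reported.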

For now I'm settling for OK

> 1. it could automatically generate the text file by crawling the web site.

Yes, crawling would be nice, but for the moment I only scripted this thing to work with data files exported from Excel. That was my immediate need. Like I said, at work my sysadmin writes scripts to read Apache logs and our content database. Surely the little experiment I did could be scripted to crawl a site, but I don't have the bandwidth to learn how to do that at the moment. I'm more of an inelegant hacker and in no way consider myself a programmer. Maybe in the future. This was just a little proof-of-concept thing for me.

> 2. if the text file was provided (eg by the user), the tool could find
conflicts between the links in the site and the text file, such as node
B claimed to be a child of node A, but no link between A and B.

Not sure what is meant here by "if the text file was provided (e.g. by the user)". Do you want to map a user's mental model of the site? Again, this can be done partially if you look at user-session-specific data in a log. Can you explain more what you mean here?

++ if the text file was provi

++ if the text file was provided by the user, the tool
++ could find conflicts between the links in the site and
++ the text file, such as node B claimed to be a child of
++ node A, but no link between A and B.

+ Not sure what is meant here by the "if the text file was
+ provided (e.g. by the user)". Do you want to map a users
+ mental model of the site. Again, this can be done partially
+ if you look at user-session specific data in a log. Can you
+ explain more what you mean here?

Now, the site map is defined by the text file.
So errors and omissions in the text file are propagated as
errors and omissions in the sitemap.

It's also work to generate this text file, though if you
already create the text file in Excel, it's not really
_extra_ work for you.

But to generate a site map by doing no work, run a file system
crawler, e.g.
File::Find in Perl,
find in UNIX,
dir /b/s in DOS,
etc.,
or a web crawler that follows links, and you get the list
needed to generate the text file.
[A simple crawler: tree.pl]

If you want a site map of a web site that does not
exist, your set-up is great.

If you already have a site and want to see what's there,
and can't remember if you clicked on all the links by
the time you get to link 200, an automated tool is required.

dc.
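As a rough illustration of the file system crawler idea, here is a Python sketch using the standard `os.walk`. It is not tree.pl. The function name `walk_to_rows` and the row layout (path, name, parent path) are assumptions for the example.

```python
import os


def walk_to_rows(root):
    """Yield (path, name, parent path) for every file and directory under root."""
    root = os.path.abspath(root)
    yield (".", os.path.basename(root), "")  # the root itself has no parent
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place also fixes the order of traversal
        parent = os.path.relpath(dirpath, root)
        for name in dirnames + sorted(filenames):
            child = os.path.normpath(os.path.join(parent, name))
            yield (child, name, parent)


if __name__ == "__main__":
    # Print the current directory tree as tab-delimited rows.
    for row in walk_to_rows("."):
        print("\t".join(row))
```

Each row names its parent, so the output already carries the parent-child relationships a site map needs.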

I see

I see what you're getting at. I have not worked with many sites that are not database-driven, so I have not had to find work-arounds to figure out how to generate data files, e.g. spidering to see what's on a site. I'm going to look at tree.pl and see if I can do an experiment crawling a site and generating an upload file. Know any similar scripts in PHP?

Source data?

(2nd attempt at comment)

I was wondering what you have in mind to use as source data to make diagrams out of.
(directory listings? link trees? or is there something you can get from particular authoring tools?)

-----------
http://dannyayers.com

Semantic Weblog

Source data again

Danny, check my reply to your first comment. Is there something more you're looking for?

Oops!

Sorry, I mistook the next item below for a different post, and didn't scroll to see it was the comments...

Update: radial diagrams, circles and colors

Added a few options including hierarchical or radial layout, box or circle shapes, fill or no fill, and shape and font coloring options.

Displaying diagrams on the web

Hello there!
I was wondering if you could help me, if possible?
I am a final year student and as part of my project I need to develop node diagrams to be displayed on a web page. The problem, however, is that the node diagrams will have to be based on information that is stored in a database; the web page is connected to the database. I need the node diagrams to be constructed automatically, simply by pressing a button on the web page.
Is there any way I can use graphviz to do this, or is there code that I could write to generate the graphs automatically? Any help would be greatly appreciated. The database is in MySQL and the web page is written in HTML and PHP.
Thank you
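One possible approach, sketched here in Python rather than PHP, is to query the database for nodes and their parents and write out GraphViz DOT text. Everything specific in this sketch is an assumption: the `pages` table, its `id`, `title` and `parent_id` columns, the function name `pages_to_dot`, and the in-memory SQLite database standing in for MySQL.

```python
import sqlite3


def pages_to_dot(conn):
    """Build DOT source from a hypothetical pages(id, title, parent_id) table."""
    rows = conn.execute(
        "SELECT id, title, parent_id FROM pages ORDER BY id").fetchall()
    lines = ["digraph nodes {"]
    for page_id, title, parent_id in rows:
        label = title.replace('"', '\\"')  # escape quotes inside the label
        lines.append(f'  n{page_id} [label="{label}"];')
        if parent_id is not None:
            lines.append(f"  n{parent_id} -> n{page_id};")
    lines.append("}")
    return "\n".join(lines)


# Stand-in data; a real page would connect to the existing MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [(1, "Home", None), (2, "Products", 1), (3, "Contact", 1)])

if __name__ == "__main__":
    print(pages_to_dot(conn))
```

When the button is pressed, the server-side script could pass the generated text to the `dot` program (for example with `-Tpng` or `-Tsvg`) and return the resulting image to the page.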