Channel: Piwik Forums - Support & Bugs

Importing multiple IIS logs from multiple servers automated, unattended

First off, I'm new here but Piwik is great work. Thanks for making it.

I've installed and configured Piwik on Ubuntu 12, mainly to parse logs sent to me over FTP from a web server farm I do not have access to. I extracted one of their uploaded archives and it expands to a folder with dozens of files from each of several servers, so I have hundreds of files to import into Piwik.

Note that these are log files for the same domain spread via load balancer to multiple web servers, not separate websites.

I have been able to successfully run python import_logs.py on a single file when I reference it by its exact name.

I've been through the docs and every related forum thread that mentions multiple server logs and/or parsing IIS logs, but haven't found the answers to these four questions:

1) Is Piwik's import_logs.py script or the DB schema smart enough to only import log data once? In other words, if I mistakenly run import_logs.py on the same file twice, is my data duplicated?

2) Is Piwik capable of dealing with log files from multiple web servers all serving the same domain (e.g. a load-balanced cluster of web servers)?

3) Does someone have an example script using find | xargs or some such, which would find all the log files and pipe them to import_logs.py? I've been unable to get this working right; only a one-off import of a single file.

4) As I watch import_logs.py run on a single file, I see:

116386 lines parsed, 74800 lines recorded, 36 records/sec (avg), 200 records/sec (current)
116386 lines parsed, 74800 lines recorded, 36 records/sec (avg), 0 records/sec (current)
116386 lines parsed, 74800 lines recorded, 36 records/sec (avg), 0 records/sec (current)
116386 lines parsed, 74800 lines recorded, 36 records/sec (avg), 0 records/sec (current)
116386 lines parsed, 74800 lines recorded, 36 records/sec (avg), 0 records/sec (current)
116705 lines parsed, 75000 lines recorded, 36 records/sec (avg), 200 records/sec (current)
116705 lines parsed, 75000 lines recorded, 36 records/sec (avg), 0 records/sec (current)
116705 lines parsed, 75000 lines recorded, 36 records/sec (avg), 0 records/sec (current)
116705 lines parsed, 75000 lines recorded, 36 records/sec (avg), 0 records/sec (current)
116705 lines parsed, 75000 lines recorded, 36 records/sec (avg), 0 records/sec (current)

Does this output indicate I have a problem, or just a very slow server? It's a t1.micro at Amazon with no usage other than running import_logs.py, but given how slowly this appears to be running, I think I will need to move it to a much larger instance size.
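
To make question 3 concrete, here is a minimal sketch of the kind of unattended wrapper I have in mind, written in Python rather than find | xargs. It walks the extracted folder, calls import_logs.py once per file, and records each finished file in a plain-text state file so a second run skips it. Several things here are assumptions for illustration: the ".log" suffix, the state file name, the helper names, and the placeholder Piwik URL and site ID. The state file only guards against re-running the wrapper on the same path; it says nothing about whether import_logs.py itself detects duplicates (question 1).

```python
import os
import subprocess
import sys


def find_log_files(root, suffix=".log"):
    """Return every file under root whose name ends in suffix, sorted by path."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def load_done(state_path):
    """Read the set of paths recorded as already imported (empty if no file yet)."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path) as fh:
        return {line.rstrip("\n") for line in fh if line.strip()}


def import_all(root, base_cmd, state_path, runner=subprocess.call):
    """Run base_cmd + [path] for each log file not yet listed in state_path.

    Stops at the first non-zero exit code so a failed file is retried next run.
    Returns the list of paths imported during this call.
    """
    done = load_done(state_path)
    imported = []
    for path in find_log_files(root):
        if path in done:
            continue
        if runner(base_cmd + [path]) != 0:
            break
        # Record success immediately so an interrupted run loses at most one file.
        with open(state_path, "a") as fh:
            fh.write(path + "\n")
        imported.append(path)
    return imported


if __name__ == "__main__" and len(sys.argv) == 2:
    # Placeholder URL and site ID; replace with the real Piwik install values.
    base_cmd = ["python", "import_logs.py",
                "--url=http://piwik.example.com/", "--idsite=1"]
    for finished in import_all(sys.argv[1], base_cmd, "imported_files.txt"):
        print("imported " + finished)
```

Saved under a hypothetical name such as batch_import.py, it would be run as python batch_import.py /path/to/extracted/logs, from cron if it needs to be unattended. Files are processed in path order, which is not necessarily chronological order.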


Thanks for your answers.
