A little bit of information about my Piwik setup:
Site 1: (Static Web Server Hosted on IIS)
>5000 pages tracked (limited by config)
~8,000,000 unique visitors a month
~3,500,000 unique visitors a day
~80,000,000 page views a month
Site 2: (Dynamic application server hosted on Solaris)
334 applications/pages tracked
~3,000,000 unique visitors a month
~1,500,000 unique visitors a day
~8,000,000 page views a month
Site 3: (Static Web Server Hosted on IIS)
>5000 pages tracked (limited by config)
~1,000,000 unique visitors a month
~500,000 unique visitors a day
~3,000,000 page views a month
Site 4: (FTP Server Hosted on IIS)
115 pages tracked
~500,000 unique visitors a month
~200,000 unique visitors a day
~2,000,000 page views a month
>5,000,000 downloads a month
All loaded into one database:
Amount of data in Database: 6 months worth
MyISAM database size: ~45GB
Time to parse and archive 1 day of data: 3 hours
Time to parse and archive 1 month of data: 4 days
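For anyone curious how the archiving is triggered: I run it from cron so it happens off-peak rather than on demand in the UI. A minimal sketch, assuming a default install layout (the path, URL, and user here are placeholders, not my actual setup):

```shell
# Hypothetical crontab entry -- adjust path, URL, and user to your install.
# misc/cron/archive.php is the bundled Piwik archiver; running it nightly
# keeps the web UI from kicking off expensive archiving during the day.
5 0 * * * www-data php /var/www/piwik/misc/cron/archive.php \
    --url=http://piwik.example.com/ > /var/log/piwik-archive.log 2>&1
```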
I created this server as a replacement for Urchin. It is:
Dual core (2.4GHz per core) VM
Red Hat Enterprise Linux 6.4
16GB memory
2 SAS disks: 200GB for the DB and 80GB for Piwik and system files
During parsing and archiving, CPU 1 is completely pegged at 100% and CPU 2 hits about 50%
10GB memory for PHP, 4GB for MySQL
I just got this up and running a few weeks ago, and I plan on back-loading one year's worth of data and maintaining at LEAST one year's worth in the DB to allow for accurate yearly statistics.
The limitation right now appears to be CPU utilization for log parsing: I only record about 600 lines/second, so when site 1 has daily logs of 30+ GB it takes a while. However, I will be adding 2 more cores when I make this a multipurpose server (I wrote an FTP crawler/management program in PHP), and I will be throwing a few more GB of memory at it as well. Disk usage is surprisingly low; iotop only shows peaks of about 6MB/s.
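If it helps anyone hitting the same single-core bottleneck: the bundled log importer can spread the recording work across multiple threads, which should let extra cores actually get used. A sketch, assuming a default install layout (the paths and URL are placeholders):

```shell
# Hypothetical paths/URL -- adjust to your install.
# --recorders spawns parallel recorder threads (roughly one per core),
# which can lift the single-threaded import bottleneck.
python /var/www/piwik/misc/log-analytics/import_logs.py \
    --url=http://piwik.example.com/ \
    --recorders=4 \
    --enable-http-errors --enable-http-redirects \
    /var/log/iis/site1/u_ex130601.log
```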
The ONLY issue I am having is page searching: with a database of this size, searching for whatever.html takes about 3 hours to return results and completely locks the session in the meantime, whereas Urchin took about 3 seconds. If anyone knows of a way to make site searching less painful, please let me know.
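As a stopgap, querying the database directly at least avoids tying up the UI session while the search runs. A rough sketch, assuming the default piwik_ table prefix (the credentials and page name are placeholders; the leading-wildcard LIKE will still scan the table, so it is not fast, just less disruptive):

```shell
# Hypothetical credentials/DB name -- adjust to your setup.
# Page URLs are stored in piwik_log_action.name; type = 1 should be
# page URLs in the default schema (worth verifying on your version).
mysql -u piwik -p piwik -e \
  "SELECT idaction, name FROM piwik_log_action \
   WHERE name LIKE '%whatever.html%' AND type = 1;"
```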