What features do you want in the new format?
-
It takes an engineer about 2 months to implement each one, so during those 2 months he could be doing PDF, MP3, PNG, etc. instead.
It is a very time-consuming and demanding project, which is why nobody but PA can read these archives currently.
-
Ah. From the sound of it, I thought the implementations were close to complete. I guess the first thing I’d like to see then is the wz jpeg compression, just for compatibility. I’m less concerned with a unique format right now as portability issues would be a hindrance. For another unique archive format to catch on, I think you’d need to release an open source portable command line version at the very least before it would be truly useful among other users.
-
Compatibility with only WinZip though, no other utility, and certainly no free or open source utility… so what is the difference really between PAF, IZEHRBLAH, and ZIPX, when only two products support extracting ZIPX archives?
The problem is that ZIP itself, because the format is so old, will never be anywhere near as good as a modern format. Whatever extensions are added to ZIPX can't be as good as a new format, and they won't be compatible with other products either.
Who knows, maybe PAF will be an open format and maybe there will be free extraction tools, eventually anyway :)
-
WinZip has a large userbase, regardless of whether we like it or not. 7-zip can handle most of ZIPX, with the exception of jpeg compression, but, Igor has stated he would implement some form of it if he saw a need.
ZIP by its nature is just a container format. So, you’re right, it doesn’t matter much what you call it. I’m just thinking about trying to capture the largest audience possible.
I’m by no means anti-PAF. heh. I’d love to see a better format. After seeing so many archive formats come and go, I know it takes time for any format to catch on if at all.
-
Yeah, but there are much bigger differences to be made with a new, proper format… you can't just put something into zip and get good results, it doesn't work that way.
Compress a folder with ZIPX LZMA and with 7-Zip LZMA, and notice that the difference in size can be up to 50%, simply because zip doesn't have, and never will have, proper solid compression.
The point with PAF is that we can make something unique with it that nobody else has: compression for many popular formats… Whether it becomes the next big thing is another matter. We plan to use it in our next backup utility as well, where nobody does any of these things and the space savings will be significant.
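To illustrate the solid compression point, here is a minimal Python sketch, assuming a folder of small files that share a lot of content. It is not how 7-Zip or PAF works internally; `make_sample_files`, `per_file_size` and `solid_size` are names made up for the illustration. It only compares compressing every file on its own (roughly what zip does) against compressing all of them as one stream.

```python
import lzma


def make_sample_files(count=20):
    """Build small in-memory 'files' that share most of their content (hypothetical data)."""
    boilerplate = b"Quarterly report boilerplate text. " * 40
    return {
        f"report_{i}.txt": boilerplate + f"Unique line {i}\n".encode()
        for i in range(count)
    }


def per_file_size(files):
    """Total size when every file is compressed independently, zip-style."""
    return sum(len(lzma.compress(data)) for data in files.values())


def solid_size(files):
    """Size when all files are concatenated and compressed as one stream (solid)."""
    blob = b"".join(files[name] for name in sorted(files))
    return len(lzma.compress(blob))


if __name__ == "__main__":
    files = make_sample_files()
    print("per-file:", per_file_size(files), "bytes")
    print("solid:   ", solid_size(files), "bytes")
```

The solid stream lets the codec reuse what it learned from earlier files, which is where the gap comes from; the trade-off is that extracting one file can mean decompressing everything before it.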
-
But pretty much everything on that list takes at least a month for a single engineer to implement… that's why nobody has done it before (it's expensive) and why all the new formats are very much alike (variations of the old ones).
-
I totally agree. I wasn’t clear on how far reaching you wanted things to be.
-
It is a massive undertaking though, and it takes a while…
But it will be worth it, for instance for backups, where we will be able to do 50% more efficient backups than the rest of the current utilities :)
-
Keep in mind we have been "researching" this for the past 2 years, so we are not starting from 0 here…
Things we did so far:
- Ver 1 JPEG compression - 3x the speed of WZ JPEG with slightly better compression. Aim in PAF: maintain similar speed while adding an extra 20% compression compared to WZ JPEG.
- Ver 1 differential and versioning system - compared to the leading Mozy backup service, we had 30% smaller differential backups (which means 30% faster, 30% more efficient, 30% less costly for bandwidth and storage space), without strong compression implemented… The goal is to tie it together into PAF and enable stronger compression that would give us a 50% gain in total.
- Various multicore research into improving speed for other operations during compression, not just the compression codec, which should add an extra 20-30% speed improvement over current formats (with similar codecs).
- Modification of the lzma2 codec into pa-lzma, to fit our format better and give better overall performance (to be released as open source). This is 80% done as it is…
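For readers wondering what a differential backup boils down to, here is a rough Python sketch of one common approach: hash fixed-size blocks and store only the blocks that changed. This is an assumption for illustration only, not the versioning system described in the list above; `BLOCK`, `changed_blocks` and `apply_delta` are hypothetical names.

```python
import hashlib

BLOCK = 4096  # hypothetical block size, chosen only for the example


def block_hashes(data, block=BLOCK):
    """SHA-256 digest of every fixed-size block in data."""
    return [hashlib.sha256(data[i:i + block]).digest()
            for i in range(0, len(data), block)]


def changed_blocks(old, new, block=BLOCK):
    """Return {block index: new bytes} for blocks that differ from the old version."""
    old_hashes = block_hashes(old, block)
    delta = {}
    for idx, start in enumerate(range(0, len(new), block)):
        chunk = new[start:start + block]
        if idx >= len(old_hashes) or hashlib.sha256(chunk).digest() != old_hashes[idx]:
            delta[idx] = chunk
    return delta


def apply_delta(old, delta, new_len, block=BLOCK):
    """Rebuild the new version from the old version plus the stored delta."""
    out = bytearray(old[:new_len].ljust(new_len, b"\0"))
    for idx, chunk in delta.items():
        out[idx * block: idx * block + len(chunk)] = chunk
    return bytes(out)
```

Only the entries in the delta need to be stored or uploaded, so a one-byte edit in a large file costs one block instead of the whole file. Fixed-size blocks handle insertions badly (everything after the insert shifts), which is one reason real tools tend to use something smarter.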
-
New formats idea site:
http://ideas.powerarchiver.com/
-
The sample size on your survey is so small, I would caution against taking it too seriously.
If you found a way to ask a broader audience what they want, I would be shocked if multi-volume support would be in the top five.
There may not be a better way to reach a broader audience, and, if so, having established voting, you probably have to act based on the suggestions you received. Still, I fear that in so doing you will be spending a lot of effort on a feature that really won’t appeal to that many people. I suspect 5-7 of the other options would be more appealing, even if they are not ones I would use.
So if you try this again, you might explore ways of getting input from a larger sample.
-
No need to worry. While the ideas site is interesting, we still have our own schedule and goals to work with (which is getting better compression on things that are not compressible currently).
The multi-volume feature is really simple to implement, but we probably would not do it that way if we didn't get enough votes.
-
If it is simple, then that’s a compelling reason to do it.
What I often don’t know is how difficult it is to implement a new feature. Some might seem easy, but be difficult (or next to impossible) while others that seem difficult might be a snap.
-
The hardest things on that list are new codecs for PDF, JPEG, MP3… that's both hard and time-consuming. Everything else on that list will probably take less time all together (!) than building a special JPEG compressor.
-
I suppose I would have thought special compression would be especially difficult. But not that much more difficult.
Thanks for letting us know.
At some point (and if not too complicated), could you explain the relative difficulties of building new compression for PDFs versus JPEGs? I assume the former would be easier (more white space). Perhaps, though, that assumption is born of ignorance.
-
#1 goal for the format: special codecs for the file types that are most used currently yet can not be compressed by current tools.
Basically this is compression for complicated (already compressed) file types.
So what you have to do is take the file apart and divide it into the parts that can and the parts that can not be compressed (the parts that are already compressed are why JPEG, PNG, DOCX and compressed PDF usually can not be compressed further)… then compress the part that can be compressed with a special codec designed for that format. All of this is done transparently to the user of course, and usually quite fast if done right.
But for each format (MP3, JPEG, PNG, PDF, DOCX, ODT, etc.), a special codec is required. So there is a lot of development work to be done here. There are also no examples of such work, and only very few utilities do it. For instance, StuffIt has a lot of special compressors, but it is available only if you give out your credit card (no actual free trial), while most other utilities do not have anything but general codecs.
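As a rough illustration of the "one special codec per format" idea, the Python sketch below picks a codec by looking at the first bytes of a file. The file signatures are the well-known ones for each format; the codec names and the `pick_codec` function are hypothetical and say nothing about how PAF actually does it.

```python
# (magic bytes, hypothetical codec name); anything unmatched falls back to a general codec
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),           # JPEG start-of-image marker
    (b"\x89PNG\r\n\x1a\n", "png"),       # PNG signature
    (b"%PDF-", "pdf"),                   # PDF header
    (b"PK\x03\x04", "zip-container"),    # ZIP local file header (DOCX, ODT, ...)
    (b"ID3", "mp3"),                     # MP3 file starting with an ID3v2 tag
]


def pick_codec(head: bytes) -> str:
    """Return the name of the codec to use, based on the first bytes of a file."""
    for magic, codec in SIGNATURES:
        if head.startswith(magic):
            return codec
    return "general"


if __name__ == "__main__":
    print(pick_codec(b"%PDF-1.4\n"))       # pdf
    print(pick_codec(b"plain text file"))  # general
```

The dispatch itself is the easy part; each entry on the right-hand side stands for a codec that still has to be written.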
The reality is that most people compress things that are already compressed, so using zip, rar or 7zip on most things people usually back up or send via email will not result in great savings, or sometimes any savings at all. If you compress JPEGs to send over email, upload somewhere or simply back up, you will not gain any compression. On the other hand, with a special JPEG codec, you can expect a 20%-30% gain on your full album of pictures.
For instance, zip has one most common codec, which is deflate. WinRar has a single codec too.
Now .paf/pa/power will have at least 5-6 within the next 2 years. So you can imagine how big the task is. But the gains are big too, so it is worth it.
-
The difference between PDF and JPEG is that while for both you have to develop a special recompression algorithm, PDF uses deflate compression (the same as zip), and its contents, once opened up, can be compressed well with general codecs.
On the other hand, JPEG has many variations, so first the recompression has to be able to take apart various JPEGs, and then you have to build a completely custom codec to compress the picture inside.
So basically it is double the work compared to PDF.
The advantage with PDF is that we can reuse a lot of the code to recompress other formats like PNG, DOCX, ODT and SWF, since they all use deflate to give them (weak) compression.
For instance, DOCX contains XML files inside that are compressed with deflate. However, since they are already compressed, if you try to compress them again, you will gain very little… but if you unravel the weak compression and then apply a stronger one, big gains are possible.
Here is a test example.
1. Contract in DOCX format - 104 KB
2. DOCX compressed with 7zip - 95 KB
3. DOCX recompressed properly - 64 KB
So that's a gain of roughly 40% on a DOCX file, for instance. Imagine if you have many of them on your computer, or if your company sends many via email or a backup service… The time and cost savings are quite significant here.
Now the actual % gained is different for different formats, and there are further optimizations possible (for instance, detecting pictures and text inside a single file and compressing them with their own codecs), but you can see how much potential this has.
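The effect is easy to reproduce on any deflate-based container. The Python sketch below is only an illustration under simple assumptions: it builds a small DOCX-like zip in memory, then compares compressing the container as-is with unpacking the deflate streams first and compressing the raw contents. The function names are made up for the example, and it only measures sizes. A real recompressor would also have to keep enough information to rebuild the original file exactly, which this sketch does not attempt.

```python
import io
import lzma
import zipfile


def make_docx_like(parts):
    """Pack {name: bytes} into a deflate-compressed zip, like a DOCX container."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in parts.items():
            zf.writestr(name, data)
    return buf.getvalue()


def compress_as_is(container):
    """Size after running a general codec over the already-deflated container."""
    return len(lzma.compress(container))


def recompress_unpacked(container):
    """Size after undoing deflate first and compressing the raw contents instead."""
    with zipfile.ZipFile(io.BytesIO(container)) as zf:
        raw = b"".join(zf.read(name) for name in zf.namelist())
    return len(lzma.compress(raw))


if __name__ == "__main__":
    xml = "".join(
        f"<w:p><w:r><w:t>Clause {i}: the party agrees to term {i * 7 % 13}.</w:t></w:r></w:p>"
        for i in range(2000)
    ).encode()
    container = make_docx_like({"word/document.xml": xml})
    print("container:           ", len(container), "bytes")
    print("compressed as-is:    ", compress_as_is(container), "bytes")
    print("unpacked + compressed:", recompress_unpacked(container), "bytes")
```

The numbers will differ from the DOCX example above since the input here is synthetic, but the ordering should be the same: the as-is result barely improves on the container, while the unpacked contents compress much further.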
The main thing here is that it has to be done seamlessly and it has to be fast, otherwise people will not use it. And that brings us to part #2 of the new format: multicore optimizations.
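One generic way to use several cores, sketched here in Python purely as an illustration (the post does not say how PAF does it), is to split the input into chunks and compress the chunks in parallel. `compress_chunks` and `decompress_chunks` are hypothetical names, and threads are used on the assumption that CPython's `lzma` module releases the GIL while it compresses; a process pool would be the alternative.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor


def compress_chunks(data, chunk_size=1 << 20, workers=4):
    """Split data into fixed-size chunks and compress them in parallel."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the chunks in their original order
        return list(pool.map(lzma.compress, chunks))


def decompress_chunks(blocks):
    """Decompress each block and join them back into the original data."""
    return b"".join(lzma.decompress(block) for block in blocks)
```

The catch is that independent chunks can not share redundancy with each other, so chunk size becomes a trade-off between speed and compression ratio.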
-
thanks!
-
I would have to say that JPEG and PDF compression is a good idea! As you're possibly aware, I have brought picture compression up a few times in the past, with good reason.
But with a list as long as the one above, there are so many other good things to choose from.
I have made some votes. :o)
-
I would like the new format to introduce some sort of protection, so that if the archive is changed it tells you by whom, and so that you can block changes if you don't want it updated but still want people to be able to extract files from it.