PG Offline

PG Offline => Version 4 Beta => Topic started by: Wilson Logan on July 12, 2014, 10:31:28 AM

Title: Export to external formats
Post by: Wilson Logan on July 12, 2014, 10:31:28 AM
Hi Guys,

As many of you will be aware, PGO has no export to external format e.g. Word, RTF, Excel, etc.

Dan & I are discussing introducing such a facility but we are unsure of what exactly it is people need to do with PGO data externally.

If you can use this thread to tell us what you're doing & how you would want an 'export to external' function to work, we would be grateful.

Thanks,

Wilson.
Title: Re: Export to external formats
Post by: prochargedmopar on July 12, 2014, 02:24:42 PM
Wow,
Seems like forever since I used this program.
Had to be somewhere in the early 2000's.
And you guys are still going at it!
Title: Re: Export to external formats
Post by: GMac999 on July 12, 2014, 02:49:36 PM
I would prefer a text editor of some type, word/rtf/notepad.  A lot of conversations extend over many emails.  The groups I'm in have you trim your posts, so the only way to get all the suggestions is through cut and paste.
Title: Re: Export to external formats
Post by: BigGuy on July 12, 2014, 03:10:47 PM
This would be an extremely welcomed feature.

I think any of the formats mentioned Word, RTF, Excel all would be excellent.  But perhaps a simple and universal format such as plain text would be more useful for the purpose of archiving messages for reference which is my main need.

If I had to pick only one format it would probably be plain text.  My second choice would be Excel.  The first is efficient for quick and simple archiving and the second a good way to separate fields for database needs.

Of course all of these deal with the messages themselves and not attachments, or other sections of the groups such as files.  If included in the export facility I think they would have to be exported separately, as storing embedded files in any of these formats (the ones that may even be able to do that) is likely not a good solution.

Having a link field to files in Excel and throwing the files into a folder on export would be super cool.   :D

Thank you for seeking input!
Title: Re: Export to external formats
Post by: casstk on July 12, 2014, 03:15:54 PM
We use the export feature to summarize answers, recommendations and suggestions, so RTF and TXT would be important. The ability to import the messages into another email program would even be more helpful because that would give us access to many other functions, but we understand if that is not possible for now (or ever).
Title: Re: Export to external formats
Post by: danb on July 12, 2014, 03:27:45 PM
Quote from: casstk on July 12, 2014, 03:15:54 PM
We use the export feature to summarize answers, recommendations and suggestions, so RTF and TXT would be important. The ability to import the messages into another email program would even be more helpful because that would give us access to many other functions, but we understand if that is not possible for now (or ever).

What specific functions would an email program provide that PG4 lacks?  Trying to determine if it would be easier to build those features into PG4 versus building the export piece to an email app.
Title: Re: Export to external formats
Post by: john13598 on July 12, 2014, 04:49:41 PM
I would love to see the messages reformatted into HTML format so that I can display the message history on my website. Now, I suppose you could make it Word compatible (as a .doc file and not a .docx file) and then it can be saved in HTML format. Also it would be nice if one could strip out all of the advertising and other material in the messages that is not relevant to the actual content of the message itself (for example Yahoo stuff and the like). I have sample files that I have been working on to reformat them into something that I can put up on my website and allow users to search for information that they might need. There are a total of 8200+ messages starting at message 1, the day of inception of the group.

By the way, I have to thank you guys for supplying such a nice tool. It is a life saver for our group. Your support is also the best too. Thank you so much for continuing your work on this nice tool.
Title: Re: Export to external formats
Post by: stonehollow on July 12, 2014, 05:00:19 PM
I would prefer Excel CSV (comma delimited) - that way you get both a "simple" text file and Excel capability if desired.  I mostly want it for a backup in case Yahoo! decides to go away, or if the group wants to move somewhere else.
Title: Re: Export to external formats
Post by: Denos on July 12, 2014, 05:14:29 PM
Greetings Wilson and others. I too have been using PGO for years and miss it when Yahoo changes format.
For my several lists, what I try to do regularly is post:

Statistics about who took part in a given discussion thread. Who posted, initiated etc.
I love the digest function - when it works.


I guess the ability to export easily to WORD would help greatly.

Dennis aka Denos
Title: Re: Export to external formats
Post by: Gasmann on July 12, 2014, 06:51:28 PM
An export function may be useful now and then. I would prefer the CSV or perhaps RTF format, not .doc or .docx. In Europe many people use other text-editors which are better and cheaper than MS Word.
IMHO improving PGO's search functions would be more useful.
Title: Re: Export to external formats
Post by: tcr on July 12, 2014, 08:27:54 PM
For me the "problem" I would try to solve is one of integrating email archives from various sources. As background, I manage a community association with around 850 users whose email addresses include around 150 domains -- I don't know how many independent ISPs and mail servers. They use many different mail readers/composers on many different devices (phones, tablets, laptops, desktops, etc.). Many don't even have (or want) Yahoo accounts. To make matters more complicated, we used Yahoo for 13 years and recently moved to Google because changing Yahoo policies made it much harder to manage such a large group. These providers have their own business models and are not particularly interested in allowing users to merge information across vendors.

So, when a user asks, "Where do I find a good plumber for kitchen repairs?", "What is the latest El Niño forecast for the coming winter?", etc., answers have to be collected, often from archives, and merged from diverse sources. PGO itself is very powerful for accessing, searching, and digesting Yahoo group email material -- and you guys are to be commended for such a rich, well-designed, and high-performance tool. However, it stands as one part of a whole and the "export" feature for me would allow merging PGO results with results from other email, web, db, etc. sources.

I think the ability to export individual messages as (a) simple text or (b) HTML-tagged text and (c) as HTML-formatted .eml files would be great -- other common email file formats would likely also be useful. I think the ability to export digests derived from PGO searches similarly would be very useful. Generally HTML can be imported into Word and Excel if needed. I personally use Thunderbird with IMAP access to my email services and these have nice ways to do simple editing and merging. I also use EMACS as a text editor, which is very powerful to edit special formats (like HTML) with user-defined macros and E-Lisp functions.

In the grandest long-term view, PGO might become a way to integrate and manage diverse collections of structured texts (email being a very important example) across different sources and specialties.

Tom Rindfleisch
Title: Re: Export to external formats
Post by: t2000kw on July 12, 2014, 08:55:00 PM
Wilson,

If you want to cover the most bases with the lowest amount of work involved, an export to a comma delimited file would work with almost all spreadsheet software. While I use Excel and get MS Office through my workplace through a Microsoft Employee Purchase Program for about $10, not everyone has Excel. But Apache OpenOffice and LibreOffice are both free and have a spreadsheet program as part of their suites.

For me, Excel would be best, but I can get by with the simpler almost universal comma delimited (separated) format (.csv files). That would allow anyone to use the file as a database of sorts, much more powerful than a word processor file.

If you also add a word processor format, keep in mind that the older .doc format can be read by the free office suites and MS Word as well. Also, RTF is almost universal, too.

Which would be preferred will no doubt depend on how many columns of data you have to export and import. A word processor file would only likely be useful for a limited number of columns and characters, even in landscape mode. A spreadsheet would be useful for any number of columns or characters your program could possibly export.

I believe you can already import the data file into MS Access. If so, it might be able to be imported into the free office suites as a database file. Not sure on that as databases are not my strong suit, though I can set up simple ones.

If you need a vote, CSV (comma delimited (or comma separated) values) would get my vote since everyone could have access to it with the free office suites available. Even those of us using Excel can easily import a CSV file into the program.

It's been a while since I've had a need to use PGO, but now that Yahoo keeps us from downloading a member list of more than the first 1,000 members, will yours handle the full number of members? If I remember, the way it works, it should.

Donald
Title: Re: Export to external formats
Post by: Wilson Logan on July 12, 2014, 10:14:29 PM
Hi Donald,

Not sure on that number of members thing. I'm currently locked out so I can't test it.

Will let you know tomorrow.

Cheers,

Wilson.
Title: Re: Export to external formats
Post by: danb on July 12, 2014, 10:19:18 PM
Thanks for all the good info everyone.

So if we could do CSV and possibly .eml we might cover most needs.

For CSV, since the message content is HTML I think I would store the HTML as an encoded string (solves the problem of commas) and Excel users would need to have a function in Excel to decode, which shouldn't be a big deal.

For .eml, the only potential problem is that Yahoo does not provide the full email address for each message; it is truncated.  So some email programs might balk at 'bad' email addresses.
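
As a rough illustration of the .eml idea, here is a minimal Python sketch using the standard library's email package. The function name build_eml, the sender address and the sample message are all made up for the example; nothing here reflects how PG4 stores or would export its data, and the truncated-address problem mentioned above is not addressed.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_eml(sender: str, subject: str, html_body: str) -> bytes:
    """Return the bytes of a minimal .eml file with an HTML body (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["Subject"] = subject
    # Keep the message content as text/html rather than converting it.
    msg.set_content(html_body, subtype="html")
    return msg.as_bytes()


# Placeholder data; the address uses the reserved .invalid domain.
raw = build_eml(
    "jsmith@example.invalid",
    "Plumber recommendation",
    "<p>Try Joe's, he is great.</p>",
)

# Parse it back the way a mail client would read the saved file.
parsed = message_from_bytes(raw, policy=policy.default)
```

Writing raw to a file with an .eml extension should give something most mail clients can open, though how each client treats incomplete addresses would need testing.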

Possibly an export to individual html files might be useful for some?
Title: Re: Export to external formats
Post by: Wilson Logan on July 12, 2014, 10:57:19 PM
Hi Dan,

Just a small point on the hex coded string issue. I've been contacted by a PGO user trying to convert the PGO Message Content from hex string to ASCII and I thought it'd be easy in Excel.

Not so much. It'll certainly need a bit of VB or something to plug in.

I did find a proposed solution (an Excel macro) but users reported that it did not work.

Maybe my Google-fu isn't up to the job but I couldn't find a working macro for Excel to do this.

BTW what about Pipe (|) separated values. Yes, I know commas are more normal but all you have to do is tell Excel what value you're using as a separator and hey presto.
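
A small sketch of the pipe-separated idea, using Python's csv module only as a stand-in for whatever PG4 would actually use. The rows are invented sample data. It shows that a comma inside a field needs no special handling once the separator is a pipe, although a pipe inside a message body would then need quoting instead.

```python
import csv
import io

# Invented sample rows; the second one has commas inside a field.
rows = [
    ["id", "author", "subject"],
    ["1", "wilson", "Export, import and archiving"],
]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="|")
writer.writerows(rows)
pipe_text = buffer.getvalue()

# Reading it back only requires naming the same delimiter.
read_back = list(csv.reader(io.StringIO(pipe_text), delimiter="|"))
```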


Cheers,

Wilson.   
Title: Re: Export to external formats
Post by: danb on July 12, 2014, 11:31:42 PM
For CSV we might be able to escape just the commas in the HTML; then in Excel you would unescape them.

And it looks like HTML could be viewable in Excel:
http://stackoverflow.com/questions/9999713/html-text-with-tags-to-formatted-text-in-an-excel-cell
Title: Re: Export to external formats
Post by: Eric Chaney on July 12, 2014, 11:44:55 PM
Like another user here I haven't used the program in over a couple of years, but for us the target was exporting several hundred thousand messages to a MYSQL database.

The biggest problem we had with the format PG Offline saved to was when several emails were sent on the same day PG would ignore the timestamps on them and post newest to oldest for the day thus making the replies appear before the topic started.
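
To illustrate the ordering issue, a minimal Python sketch that sorts on the full timestamp rather than the date alone. The message dictionaries and the ISO timestamp format are assumptions made for the example, not the format PG Offline actually stores.

```python
from datetime import datetime

# Invented messages; two were sent on the same day, listed newest first.
messages = [
    {"subject": "Re: Meeting", "sent": "2014-07-12T14:24:42"},
    {"subject": "Meeting", "sent": "2014-07-12T10:31:28"},
    {"subject": "Older topic", "sent": "2014-07-11T09:00:00"},
]


def by_timestamp(items):
    """Sort oldest to newest using date and time, so replies follow their topic."""
    return sorted(items, key=lambda m: datetime.fromisoformat(m["sent"]))


ordered = [m["subject"] for m in by_timestamp(messages)]
```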

The second problem we had was with pictures and attachments. PG simply wouldn't download them at all, but we just ended up doing those manually.

I think like most other users we simply wanted our data to be easily categorized, recategorized, and manipulated at will. That sometimes isn't possible with a simple excel export so we chose to write our own export program to move it to mysql.

Thank you for all the hard work on the program. I am thrilled that you guys are still working on it and even more so that you cared enough to ask the end user for input. Well done guys.
Title: Re: Export to external formats
Post by: danb on July 13, 2014, 12:14:32 AM
Quote from: Eric Chaney on July 12, 2014, 11:44:55 PM
Like another user here I haven't used the program in over a couple of years, but for us the target was exporting several hundred thousand messages to a MYSQL database.

Thanks for the feedback Eric!

PG4 data currently lives in a SQLite database.  PG4 doesn't really even need to perform conversion to other database since there are tools that do this already. So my initial thought is exports to other databases might not be needed with PG4, versus the older PG.  I'm not against exporting to other DB formats but only if no other tool is workable.
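
As a sketch of what a generic tool would do with a SQLite file, the following Python dumps a table to CSV with a header row. The table name messages and its columns are hypothetical; the real PG4 schema is not described in this thread, so an in-memory database stands in for it.

```python
import csv
import io
import sqlite3

# Hypothetical schema; the real PG4 tables and column names are not known here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, author TEXT, subject TEXT, body_html TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [
        (1, "eric", "Hello", "<p>First, second</p>"),
        (2, "dan", "Re: Hello", "<p>Reply</p>"),
    ],
)


def export_csv(connection: sqlite3.Connection) -> str:
    """Dump the messages table to CSV text, header row first."""
    cursor = connection.execute(
        "SELECT id, author, subject, body_html FROM messages ORDER BY id"
    )
    out = io.StringIO()
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)
    return out.getvalue()


csv_text = export_csv(conn)
```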

On that thought, it might be wise for me to look at how some other DB tools export HTML to a CSV file. 

Hopefully we can all narrow down exporting to a few large use-cases that covers the majority.

File, photo and attachment downloads are working now but could use more testing.
Title: Re: Export to external formats
Post by: gfineman on July 13, 2014, 02:58:27 PM
As you probably know, Yahoo does not require that you be a Yahoo member in order to be a member of a listserv. That is, you can send and receive emails with the listserv without a Yahoo ID or password. I support a number of community groups that each have about one third of their membership without Yahoo logins.  This means that they cannot use the Yahoo search function. I use PGO to create a shadow copy of the emails for each of these groups. I then use a custom program (in Word VBA) to extract these messages from the PGO database and create separate HTM files for each one. These are then put on a website where they are searchable using the Zoom search engine. This lets users find plumbers, roofers, etc.

If you implement an export to a comma-delimited file or Excel, it might be useful to be able to append only the new messages since the last export. One of my larger groups already has 160,000 messages.
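
One way to picture the append-only export is to remember the highest message number already written and skip everything at or below it on the next run. The Python below is only a sketch under that assumption; the function name, the tuple layout and the sample messages are invented, and it assumes message numbers increase over time.

```python
import csv
import io


def append_new_messages(out, messages, last_exported_id):
    """Append rows with id > last_exported_id; return the new highest id written."""
    writer = csv.writer(out)
    newest = last_exported_id
    for msg_id, author, subject in sorted(messages):
        if msg_id > last_exported_id:
            writer.writerow([msg_id, author, subject])
            newest = max(newest, msg_id)
    return newest


# Invented archive contents.
archive = [(1, "ann", "Roofers"), (2, "bob", "Re: Roofers"), (3, "cy", "Plumbers")]

out = io.StringIO()
mark = append_new_messages(out, archive[:2], last_exported_id=0)  # first export
mark = append_new_messages(out, archive, last_exported_id=mark)   # later export adds only id 3
```

In a real export the marker would have to be saved between runs, for example alongside the output file.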
Title: Re: Export to external formats
Post by: danb on July 13, 2014, 03:43:53 PM
For CSV I was thinking we would convert the html message portion to base64 encoding. 

Then in Excel you can decode base64.  http://thirdpinion.com/main/programming/encode-and-decode-base64-within-ms-excel/

The base64 encoded data would look like this for example:

AAECAwQFBgcICQoLDA0ODxAREhMUFRYXGBkaGxwdHh8gISIjJCUmJygpKissLS4vMDEyMzQ1Njc4
OTo7PD0+P0BBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWltcXV5fYGFiY2RlZmdoaWprbG1ub3Bx
cnN0dXZ3eHl6e3x9fn+AgYKDhIWGh4iJiouMjY6PkJGSk5SVlpeYmZqbnJ2en6ChoqOkpaanqKmq
q6ytrq+wsbKztLW2t7i5uru8vb6/wMHCw8TFxsfIycrLzM3Oz9DR0tPU1dbX2Nna29zd3t/g4eLj
5OXm5+jp6uvs7e7v8PHy8/T19vf4+fr7/P3+/w==
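
For anyone wanting to see the round trip outside Excel, here is a minimal Python sketch of the base64 approach. The HTML snippet and the row layout are invented for illustration. The encoded field contains no commas, quotes or line breaks, which is what makes it safe to drop into a CSV cell.

```python
import base64
import csv
import io

# Invented message body containing a comma, quotes and a line break.
html = '<p>Hello, "world"</p>\n<p>Second line</p>'

# Encode the HTML so the CSV field holds only base64 characters.
encoded = base64.b64encode(html.encode("utf-8")).decode("ascii")

out = io.StringIO()
csv.writer(out).writerow(["42", "danb", encoded])

# Read the row back and decode the third column to recover the HTML.
row = next(csv.reader(io.StringIO(out.getvalue())))
decoded = base64.b64decode(row[2]).decode("utf-8")
```

The trade-off is that the column is unreadable until decoded, so anyone opening the file directly sees only the encoded text.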
Title: Re: Export to external formats
Post by: Eric Chaney on July 13, 2014, 04:54:15 PM
I think if you use base 64 you will have a lot of frustrated users that are going to expect plain text or UTF-8 and give up after the export isn't what they expect. Think older folks or those not so tech savvy, like the type that run Yahoo groups.
Title: Re: Export to external formats
Post by: BigGuy on July 15, 2014, 03:07:24 AM
Just adding my vote to the CSV solution.  It's very flexible.  The CSV flavor doesn't matter all that much to me although I would prefer something that Excel can easily parse by default without a lot of fuss.  Something like the RFC-4180 standard with header would do the trick I think.
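
As a sketch of what that could look like, Python's csv module with its default dialect writes a header, comma separators, CRLF line endings and doubled quotes, which is close to RFC 4180. The field names and the sample message are assumptions for the example. Fields containing commas, quotes or line breaks are wrapped in quotes, so the HTML survives without custom escaping.

```python
import csv
import io

# Hypothetical column names and one invented message.
fieldnames = ["id", "author", "body_html"]
messages = [
    {"id": "1", "author": "BigGuy", "body_html": '<p class="x">One, two</p>\n<p>three</p>'},
]

out = io.StringIO(newline="")
writer = csv.DictWriter(out, fieldnames=fieldnames)  # default dialect: comma, double quote, CRLF
writer.writeheader()
writer.writerows(messages)
csv_text = out.getvalue()

# Reading it back restores the original HTML, comma, quotes and newline included.
restored = list(csv.DictReader(io.StringIO(csv_text, newline="")))
```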
Title: Re: Export to external formats
Post by: GMac999 on July 15, 2014, 06:36:00 PM
Quote from: danb on July 12, 2014, 03:27:45 PM
What specific functions would an email program provide that PG4 lacks?  Trying to determine if it would be easier to build those features into PG4 versus building the export piece to an email app.

I'm using both Yahoo groups, Gmail Groups and list servers.  A lot of times information is needed between one or the other and if it could be exported, it would be a lot easier.  Right now it's cut and paste to outlook, go back get parts you missed and paste them back in.
Title: Re: Export to external formats
Post by: Wilson Logan on July 15, 2014, 07:32:10 PM
Gmail groups?  You mean Google groups?
Title: Re: Export to external formats
Post by: danb on July 15, 2014, 07:38:40 PM
Quote from: GMac999 on July 15, 2014, 06:36:00 PM
Quote from: danb on July 12, 2014, 03:27:45 PM
What specific functions would an email program provide that PG4 lacks?  Trying to determine if it would be easier to build those features into PG4 versus building the export piece to an email app.

I'm using both Yahoo groups, Gmail Groups and list servers.  A lot of times information is needed between one or the other and if it could be exported, it would be a lot easier.  Right now it's cut and paste to outlook, go back get parts you missed and paste them back in.

How would exporting help, wouldn't it still be a cut-n-paste operation?
Title: Re: Export to external formats
Post by: martin_lists on July 16, 2014, 09:10:30 PM
Wasn't the previous .mdb file format compatible with Excel? I seem to recall being able to open my saved archive files that way.  Anything that can be opened in Excel or similar would be fine.
Title: Re: Export to external formats
Post by: plaforte on July 17, 2014, 03:22:22 PM
Sorry for the late answer; I am on a trip around the planet. Here I quote part of Stonehollow's message:

"I would prefer Excel CSV (comma delimited) - that way you get both a "simple" text file and Excel capability if desired."

This in my opinion would be a great option. I am using PG for researching among 70,000 posts in a forum. I need to categorize the content of each post and sometimes one post may cover 5 categories and so on. Excel is for me very easy, I can add columns for categories and do search and filter, all that easily.

Thank you for considering this added functionality.

Pierre
Title: Re: Export to external formats
Post by: sanpats on August 02, 2014, 03:46:22 PM
Quote from: danb on July 13, 2014, 03:43:53 PM
For CSV I was thinking we would convert the html message portion to base64 encoding. 

Then in Excel you can decode base64.  http://thirdpinion.com/main/programming/encode-and-decode-base64-within-ms-excel/

The base64 encoded data would look like this for example:

AAECAwQFBgcICQoLDA0ODxAREhMUFRYXGBkaGxwdHh8gISIjJCUmJygpKissLS4vMDEyMzQ1Njc4
OTo7PD0+P0BBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWltcXV5fYGFiY2RlZmdoaWprbG1ub3Bx
cnN0dXZ3eHl6e3x9fn+AgYKDhIWGh4iJiouMjY6PkJGSk5SVlpeYmZqbnJ2en6ChoqOkpaanqKmq
q6ytrq+wsbKztLW2t7i5uru8vb6/wMHCw8TFxsfIycrLzM3Oz9DR0tPU1dbX2Nna29zd3t/g4eLj
5OXm5+jp6uvs7e7v8PHy8/T19vf4+fr7/P3+/w==

This is a nice idea, but I would also love to have an option for plain text.

How soon can we expect this feature?
Title: Re: Export to external formats
Post by: txpigeon on August 03, 2014, 03:04:24 AM
I know I'm late to the party (as usual).  I used another program to export a couple of groups to .csv format to import into Excel (2003).  I ran into a problem on many messages getting out of whack.  Turns out they were larger than 64KB and Excel couldn't handle them in a single cell, throwing things off after each occurrence.  I also had to break a large group into multiple files or there would be too many rows.  I don't really see a need for PGO to export as there are free programs to do the job.  If anything a simple csv or text file should give the most flexibility.
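
The two problems described above (too many rows, cells too large) could be checked before export with something like the Python sketch below. The limits used are the commonly documented ones for Excel 97-2003 sheets (65,536 rows) and for text in a single cell (32,767 characters); treat them as assumptions to verify against the Excel version in use. The helper names and sample rows are invented.

```python
from typing import Iterator, List, Sequence

XLS_MAX_ROWS = 65_536    # assumed worksheet row limit for Excel 97-2003
CELL_MAX_CHARS = 32_767  # assumed per-cell character limit


def chunk_rows(rows: Sequence[List[str]], max_rows: int = XLS_MAX_ROWS - 1) -> Iterator[List[List[str]]]:
    """Yield slices small enough for one sheet, leaving one row for a header."""
    for start in range(0, len(rows), max_rows):
        yield list(rows[start:start + max_rows])


def oversized(rows: Sequence[List[str]], limit: int = CELL_MAX_CHARS) -> List[int]:
    """Return the indexes of rows that contain a cell longer than the limit."""
    return [i for i, row in enumerate(rows) if any(len(cell) > limit for cell in row)]


# Tiny demo with small limits so the behaviour is easy to see.
rows = [[str(i), "x" * (i * 10)] for i in range(5)]
parts = list(chunk_rows(rows, max_rows=2))
too_big = oversized(rows, limit=25)
```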

I'm no longer on Yahoo, so won't be archiving any more groups, but it sure came in handy while trying to preserve everything I could when NEO struck.  From what I've heard, NEO is still a long ways from fixed, friendly, and useable.

Duane
Title: Re: Export to external formats
Post by: nightstitcher on December 03, 2014, 03:32:18 AM
Personally, I like Access (mdb), but anything (Excel, comma-delimited etc.) that Access could import would be fine.