Export to external formats

Started by Wilson Logan, July 12, 2014, 10:31:28 AM

danb

For CSV, we might be able to escape just the commas in the HTML; then in Excel you would unescape them.

And it looks like HTML could be viewable in Excel:
http://stackoverflow.com/questions/9999713/html-text-with-tags-to-formatted-text-in-an-excel-cell
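
As a point of comparison with hand-escaping commas, here is a minimal sketch of letting a standard CSV writer quote the HTML field instead. The message body and column layout are made up for illustration; nothing here reflects PG4's actual export code.

```python
import csv
import io

# Hypothetical message body containing commas, double quotes, and a line break.
html_body = '<p>Hello, world</p>\n<a href="http://example.com">link, here</a>'

# newline="" leaves line endings untouched, as the csv module recommends.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)  # default dialect quotes a field only when needed
writer.writerow([101, "Wilson Logan", html_body])

# Reading the row back returns the HTML unchanged, commas and all.
buffer.seek(0)
row = next(csv.reader(buffer))
print(row[2] == html_body)
```

With quoting handled by the writer, the commas never need a separate unescape step on the Excel side.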

Eric Chaney

Like another user here, I haven't used the program in over a couple of years, but for us the target was exporting several hundred thousand messages to a MySQL database.

The biggest problem we had with the format PG Offline saved to was that when several emails were sent on the same day, PG would ignore their timestamps and post them newest to oldest for the day, so replies appeared before the message that started the topic.
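
The ordering problem described above goes away if an export sorts on the full timestamp rather than the date alone. A minimal sketch, assuming each message carries a timestamp string in the format shown (the records and the format are hypothetical, not PG's storage layout):

```python
from datetime import datetime

# Hypothetical records: (subject, timestamp string), all sent on the same day.
messages = [
    ("Re: Plumber wanted", "2014-07-12 18:05:00"),
    ("Plumber wanted", "2014-07-12 09:15:00"),
    ("Re: Plumber wanted", "2014-07-12 12:40:00"),
]

def sent_at(message):
    """Parse the full timestamp so the time of day takes part in the sort."""
    return datetime.strptime(message[1], "%Y-%m-%d %H:%M:%S")

# Oldest first, so the topic starter comes before its replies.
ordered = sorted(messages, key=sent_at)
for subject, stamp in ordered:
    print(stamp, subject)
```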

The second problem we had was with pictures and attachments. PG simply wouldn't download them at all, but we just ended up doing those manually.

I think, like most other users, we simply wanted our data to be easily categorized, recategorized, and manipulated at will. That sometimes isn't possible with a simple Excel export, so we chose to write our own export program to move it to MySQL.

Thank you for all the hard work on the program. I am thrilled that you guys are still working on it and even more so that you cared enough to ask the end user for input. Well done guys.

danb

Quote from: Eric Chaney on July 12, 2014, 11:44:55 PM
Like another user here I haven't used the program in over a couple of years, but for us the target was exporting several hundred thousand messages to a MYSQL database.

Thanks for the feedback Eric!

PG4 data currently lives in a SQLite database.  PG4 doesn't really need to convert to other databases itself, since there are tools that do this already. So my initial thought is that exports to other databases might not be needed with PG4, unlike with the older PG.  I'm not against exporting to other DB formats, but only if no other tool is workable.

On that thought, it might be wise for me to look at how some other DB tools export HTML to a CSV file. 

Hopefully we can all narrow down exporting to a few large use cases that cover the majority.

File, photo and attachment downloads are working now but could use more testing.
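
Since the data already sits in SQLite, a generic table-to-CSV dump is short to write with standard tools. The sketch below uses an in-memory database and a hypothetical `messages` table; the real PG4 schema is not described in this thread, so the table and column names are assumptions.

```python
import csv
import io
import sqlite3

def export_table_to_csv(conn, table, out):
    """Write every row of `table` to `out` as CSV, with a header row."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)

# Stand-in database; the table and columns are invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, author TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "danb", "<p>Hi, all</p>"), (2, "BigGuy", "CSV please")],
)

out = io.StringIO(newline="")
export_table_to_csv(conn, "messages", out)
csv_text = out.getvalue()
print(csv_text)
```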

gfineman

As you probably know, Yahoo does not require that you be a Yahoo member in order to be a member of a listserv. That is, you can send and receive emails with the listserv without a Yahoo ID or password. I support a number of community groups that each have about one third of their membership without Yahoo logins.  This means that they cannot use the Yahoo search function. I use PGO to create a shadow copy of the emails for each of these groups. I then use a custom program (in Word VBA) to extract these messages from the PGO database and create separate HTM files for each one. These are then put on a website where they are searchable using the Zoom search engine. This lets users find plumbers, roofers, etc.

To implement an export to a comma-delimited file or Excel, it might be useful to be able to append only the messages that are new since the last export. One of my larger groups already has 160,000 messages.
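
One way to sketch the append-only idea is to remember the highest message id already exported and write only rows above it. This assumes message ids increase over time; the file names, columns, and sample rows are hypothetical.

```python
import csv
import os
import tempfile

def append_new_messages(messages, csv_path, state_path):
    """Append messages whose id is above the saved high-water mark."""
    last_id = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            last_id = int(f.read().strip() or 0)
    new_rows = [m for m in messages if m[0] > last_id]
    if not new_rows:
        return 0
    write_header = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["id", "subject"])
        writer.writerows(new_rows)
    # Record the new high-water mark for the next run.
    with open(state_path, "w") as f:
        f.write(str(max(m[0] for m in new_rows)))
    return len(new_rows)

workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "export.csv")
state_path = os.path.join(workdir, "last_id.txt")

first = append_new_messages([(1, "Roofer?"), (2, "Re: Roofer?")], csv_path, state_path)
second = append_new_messages(
    [(1, "Roofer?"), (2, "Re: Roofer?"), (3, "Plumber")], csv_path, state_path
)
print(first, second)
```

The second call rereads the whole message list but writes only the one row that was not exported before, which is the behavior that matters for a 160,000-message group.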

danb

For CSV I was thinking we would convert the html message portion to base64 encoding. 

Then in Excel you can decode base64.  http://thirdpinion.com/main/programming/encode-and-decode-base64-within-ms-excel/

The base64 encoded data would look like this for example:

AAECAwQFBgcICQoLDA0ODxAREhMUFRYXGBkaGxwdHh8gISIjJCUmJygpKissLS4vMDEyMzQ1Njc4
OTo7PD0+P0BBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWltcXV5fYGFiY2RlZmdoaWprbG1ub3Bx
cnN0dXZ3eHl6e3x9fn+AgYKDhIWGh4iJiouMjY6PkJGSk5SVlpeYmZqbnJ2en6ChoqOkpaanqKmq
q6ytrq+wsbKztLW2t7i5uru8vb6/wMHCw8TFxsfIycrLzM3Oz9DR0tPU1dbX2Nna29zd3t/g4eLj
5OXm5+jp6uvs7e7v8PHy8/T19vf4+fr7/P3+/w==
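
A minimal sketch of the round trip, using a made-up HTML body. The base64 alphabet contains no commas, quotes, or line breaks, which is why the encoded field is safe to drop into a CSV cell. The last lines check that the sample above appears to be simply the byte values 0x00 through 0xFF, encoded.

```python
import base64

# Hypothetical message body with a comma, quotes, a line break, and a non-ASCII letter.
html_body = '<p>Caf\u00e9 prices, "as quoted"</p>\n<p>Second line</p>'

# Encode: text -> UTF-8 bytes -> base64 text that is safe inside a CSV field.
encoded = base64.b64encode(html_body.encode("utf-8")).decode("ascii")

# Decode: reverse the two steps to recover the original HTML.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded == html_body)

# The sample block in the post matches the encoding of bytes 0..255.
sample = base64.b64encode(bytes(range(256))).decode("ascii")
print(sample[:8], sample[-8:])
```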

Eric Chaney

I think if you use base64 you will have a lot of frustrated users who expect plain text or UTF-8 and give up when the export isn't what they expect. Think of older folks or those who are not so tech savvy, like the type who run Yahoo groups.

BigGuy

Just adding my vote to the CSV solution.  It's very flexible.  The CSV flavor doesn't matter all that much to me, although I would prefer something that Excel can easily parse by default without a lot of fuss.  Something like the RFC 4180 format with a header row would do the trick, I think.
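
For reference, a minimal sketch of that flavor: a header row, comma separators, CRLF line endings, and double quotes doubled inside quoted fields, which is what Python's default `csv` dialect produces. The column names and rows are hypothetical. Writing the file as `utf-8-sig` adds a byte order mark, which Excel generally uses to detect UTF-8; that choice is an assumption about the target audience, not something RFC 4180 asks for.

```python
import csv
import os
import tempfile

FIELDS = ["id", "date", "author", "subject"]  # hypothetical column names
rows = [
    {"id": 1, "date": "2014-07-12", "author": "Wilson Logan",
     "subject": "Export to external formats"},
    {"id": 2, "date": "2014-07-13", "author": "BigGuy",
     "subject": 'Re: "CSV", please'},
]

path = os.path.join(tempfile.mkdtemp(), "messages.csv")

# newline="" lets the csv module write its own CRLF line endings.
with open(path, "w", newline="", encoding="utf-8-sig") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm the quoted subject survives intact.
with open(path, newline="", encoding="utf-8-sig") as f:
    read_back = list(csv.DictReader(f))
print(read_back[1]["subject"])
```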

GMac999

Quote from: danb on July 12, 2014, 03:27:45 PM
What specific functions would an email program provide that PG4 lacks?  Trying to determine if it would be easier to build those features into PG4 versus building the export piece to an email app.

I'm using Yahoo groups, Gmail Groups and list servers.  A lot of times information needs to move from one to another, and if it could be exported, it would be a lot easier.  Right now it's cut and paste into Outlook, then go back, get the parts you missed, and paste them in.

Wilson Logan

Gmail groups?  You mean Google groups?

danb

Quote from: GMac999 on July 15, 2014, 06:36:00 PM
Quote from: danb on July 12, 2014, 03:27:45 PM
What specific functions would an email program provide that PG4 lacks?  Trying to determine if it would be easier to build those features into PG4 versus building the export piece to an email app.

I'm using both Yahoo groups, Gmail Groups and list servers.  A lot of times information is needed between one or the other and if it could be exported, it would be a lot easier.  Right now it's cut and paste to outlook, go back get parts you missed and paste them back in.

How would exporting help? Wouldn't it still be a cut-n-paste operation?

martin_lists

Wasn't the previous .mdb file format compatible with Excel? I seem to recall being able to open my saved archive files that way.  Anything that can be opened in Excel or similar would be fine.

plaforte

Sorry for the late answer; I am on a trip around the planet. I am copying here part of Stonehollow's message:

"I would prefer Excel CSV (comma delimited) - that way you get both a "simple" text file and Excel capability if desired."

This, in my opinion, would be a great option. I am using PG to research 70,000 posts in a forum. I need to categorize the content of each post, and sometimes one post may cover 5 categories and so on. Excel is very easy for me: I can add columns for categories and search and filter, all easily.

Thank you for considering this added functionality.

Pierre

sanpats

Quote from: danb on July 13, 2014, 03:43:53 PM
For CSV I was thinking we would convert the html message portion to base64 encoding. 

Then in Excel you can decode base64.  http://thirdpinion.com/main/programming/encode-and-decode-base64-within-ms-excel/

The base64 encoded data would look like this for example:

AAECAwQFBgcICQoLDA0ODxAREhMUFRYXGBkaGxwdHh8gISIjJCUmJygpKissLS4vMDEyMzQ1Njc4
OTo7PD0+P0BBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWltcXV5fYGFiY2RlZmdoaWprbG1ub3Bx
cnN0dXZ3eHl6e3x9fn+AgYKDhIWGh4iJiouMjY6PkJGSk5SVlpeYmZqbnJ2en6ChoqOkpaanqKmq
q6ytrq+wsbKztLW2t7i5uru8vb6/wMHCw8TFxsfIycrLzM3Oz9DR0tPU1dbX2Nna29zd3t/g4eLj
5OXm5+jp6uvs7e7v8PHy8/T19vf4+fr7/P3+/w==

This is a nice idea, but I would also love to have an option to go plain text.

How soon can we expect this feature?

txpigeon

I know I'm late to the party (as usual).  I used another program to export a couple of groups to .csv format to import into Excel (2003).  I ran into a problem on many messages getting out of whack.  Turns out they were larger than 64KB and Excel couldn't handle them in a single cell, throwing things off after each occurrence.  I also had to break a large group into multiple files or there would be too many rows.  I don't really see a need for PGO to export as there are free programs to do the job.  If anything a simple csv or text file should give the most flexibility.

I'm no longer on Yahoo, so won't be archiving any more groups, but it sure came in handy while trying to preserve everything I could when NEO struck.  From what I've heard, NEO is still a long ways from fixed, friendly, and useable.

Duane
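
The two problems described above (oversized cells and too many rows) can be checked before writing any files. A minimal sketch, with the limits passed as parameters: the defaults are Excel's documented figures as far as I know (65,536 rows per sheet in Excel 2003, 32,767 characters per cell), so treat them as assumptions and adjust for your version.

```python
MAX_ROWS = 65_536        # assumed worksheet row limit for Excel 2003
MAX_CELL_CHARS = 32_767  # assumed character limit for a single cell

def split_for_excel(rows, max_rows=MAX_ROWS, max_cell_chars=MAX_CELL_CHARS):
    """Return (chunks, oversized).

    chunks: lists of rows, each small enough for one sheet plus a header row.
    oversized: indexes of rows holding a cell longer than max_cell_chars.
    """
    per_file = max_rows - 1  # leave one row for the header
    oversized = [
        i for i, row in enumerate(rows)
        if any(len(str(cell)) > max_cell_chars for cell in row)
    ]
    chunks = [rows[i:i + per_file] for i in range(0, len(rows), per_file)]
    return chunks, oversized

# Tiny limits so the behavior is visible with five hypothetical rows.
rows = [(1, "short"), (2, "x" * 50), (3, "ok"), (4, "ok"), (5, "ok")]
chunks, oversized = split_for_excel(rows, max_rows=3, max_cell_chars=40)
print([len(chunk) for chunk in chunks], oversized)
```

Rows flagged as oversized could be truncated or written to a separate text file, depending on what the user wants to keep.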

nightstitcher

Personally, I like Access (mdb), but anything (Excel, comma-delimited etc.) that Access could import would be fine.
