Exporting .pg4 to something else

Started by Jared, December 27, 2015, 10:10:32 PM

Jared

Sorry for what must be a really basic question, but I've been searching the archives and can't figure it out.

I have mostly successfully connected to my groups and downloaded the messages, and I'd like to be able to export/convert them to a database type that I can manipulate separate from PG4.  Is there a way to do that?

The groups have been closed, so there is no need for ongoing exports, just one time.  Thanks in advance.

Wilson Logan

Hi Jared,

The database which PGO uses is called SQLite.  The messages are in a table called group_message.  You can use a tool called SQLite DataWizard

https://www.sqlmaestro.com/products/sqlite/datawizard/

to export the table to CSV and from there to any other format you like.
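For anyone who would rather script the export than use a GUI tool, the following is a minimal sketch using Python's standard sqlite3 and csv modules. It assumes the .pg4 file can be opened directly as an SQLite database and that the table is named group_message as described above. The function name and file paths are made up for illustration, and the column names are read from the file rather than assumed.

```python
import csv
import sqlite3


def export_table_to_csv(db_path, csv_path, table="group_message"):
    """Dump every row of `table` to a CSV file and return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as SQL parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        cur = conn.execute(f"SELECT * FROM {quoted}")
        # Column names come from the database itself (not assumed here).
        header = [col[0] for col in cur.description]
        count = 0
        with open(csv_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            for row in cur:
                writer.writerow(row)
                count += 1
        return count
    finally:
        conn.close()
```

A call such as export_table_to_csv("mygroup.pg4", "messages.csv") (hypothetical file names) would write one CSV row per message. Working on a copy of the archive is the safer choice.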


I'm curious, what is it that you need to do with the messages that you can't do by having them in PGO?

I've spent 10 years making it the best way to read, search and organise Yahoo groups messages.

Is it:

1) You haven't used the program other than the download feature so you don't actually know what it can do.

2) You're a genius and will craft a better solution in your spare time.

3) There is some vital feature missing from PGO.


Inquiring minds want to know...

BR,

Wilson Logan.


P.S. If it's 1), please watch this video:  https://www.youtube.com/watch?v=e4PIumS33e0

Jared

Wilson, thank you for the fast reply!  I have just watched the video. The answer to your question is complicated, and involves elements of all three, excepting the part about me being a genius, since that is certainly not the case.

The main reason is that we're not planning to use your program as it is intended.  If the groups were still active, I would have been glad to find your software so that I could use it daily.  When Yahoo went "neo" we started hosting our own message board, which was a big setback for the users.  The Yahoo format was what folks had gotten used to, and there was an archive on the order of 200k messages that the message board lacked.

As for features that PG4 doesn't seem to have, and please correct me if it does: One of our groups was unfortunately not moderated for several years, so there is a swath of crap that needs to be removed to make the archive useful.  I'd like to execute that cleanup once.  It seems that each user who installs PG4 would download the messages and then still have the bad years?  Also, PG4 has been very buggy for me.  I'm using Windows 7 and since I've installed the program, I get regular pop-ups for the "object" general error message, even after I close the program.  I know that software is a constant battle, but honestly, it's just not a battle that I'm excited about fighting, being that the end result isn't exactly what I want anyway.

Another major reason is trust.  We trusted Yahoo to be good stewards of our archive, and they failed when they made the neo changeover.  I've learned that it is probably best not to trust any third party when we really want to have the data.  Please don't take it personally, but I don't trust that your software will be available and usable for as long as I'd like for it to be.

I don't want to sound ungrateful for the work that you have done, because your software is a crucial step for me.

Jared

Also, I've installed the SQLite Data Wizard, but it looks like it may need a little more info about what's going on with the .pg4 file.  Should I change the extension, or use a particular set of import settings?  When I try to specify that the file is, for example, plain text, the Data Wizard is looking for a file with that extension.
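On the extension question: SQLite itself does not care what a file is called, and an SQLite 3 database file starts with the 16-byte header string "SQLite format 3" followed by a zero byte. The sketch below checks for that header, which is a quick way to tell whether a given .pg4 file is an SQLite database at all before pointing other tools at it. Whether a particular .pg4 file passes this check is an assumption to verify, not something confirmed in this thread, and the function name is made up for illustration.

```python
# The first 16 bytes of every SQLite 3 database file.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path):
    """Return True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as f:
        return f.read(16) == SQLITE_MAGIC
```

If the check returns True, any SQLite tool that lets you pick an arbitrary file name should be able to open the file without renaming it.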

Wilson Logan

Hi Jared,

Thanks for your full reply.

I will go through your response and answer your points:


As for features that PG4 doesn't seem to have, and please correct me if it does: One of our groups was unfortunately not moderated for several years, so there is a swath of crap that needs to be removed to make the archive useful.  I'd like to execute that cleanup once.

>>>> The search and sort features of PGO coupled with the Delete function would be the best tool available to remove unwanted messages from your message archive.


  It seems that each user who installs PG4 would download the messages and then still have the bad years? 

>>>> No. Once you have downloaded the messages from Yahoo it does not need to be repeated by any user. You make the edits to the archive as you require and then you distribute the edited archive.

Choose File--> Export to Folder.  This creates a .pg4 archive file which you can distribute and which can be imported using File--> Import from Folder.

I have created a video demonstration here :  https://www.youtube.com/watch?v=yItHyx_MFao&feature=youtu.be



Also, PG4 has been very buggy for me.  I'm using Windows 7 and since I've installed the program, I get regular pop-ups for the "object" general error message, even after I close the program.  I know that software is a constant battle, but honestly, it's just not a battle that I'm excited about fighting, being that the end result isn't exactly what I want anyway.

>>>> Understood. What version do you have? We've been quite busy tidying up recently.  This is the latest version:

          http://www.personalgroupware.com/groupmanager/pg-offline-4-0-857.zip

  If you still experience "object" errors with 857 I would be very interested to hear about it.


Another major reason is trust.  We trusted Yahoo to be good stewards of our archive, and they failed when they made the neo changeover.  I've learned that it is probably best not to trust any third party when we really want to have the data.  Please don't take it personally, but I don't trust that your software will be available and usable for as long as I'd like for it to be.

>>>> Well I have been at this for 10 years but I do take your point. Having said that, is SQLite going anywhere? I very much doubt that. 

  "SQLite is the most widely deployed database engine in the world. The source code for SQLite is in the public domain."

   So... not going anywhere then.

   Additionally, I am going to include a basic "Export to Text" option in PGO which will allow you to save messages as a comforting text file.



I don't want to sound ungrateful for the work that you have done, because your software is a crucial step for me.

>>>> You betcha.  Something you should also consider is that PGO will not hold your data hostage. When the trial expires you will still be able to read, sort & search your messages as usual. You will not be able to download any new messages but that doesn't sound like it's going to be an issue for your group.

All your group members can download PGO, import your edited archive and then use PGO indefinitely for free.

It doesn't get better than that.


Cheers,

Wilson.





Wilson Logan

Have a look at this thread for details of the text export feature:

http://www.personalgroupware.com/smf/index.php?topic=1207.0

Jared

Thanks Wilson, many good points and I appreciate your responsiveness. The ability to clean up the group once and redistribute the archive is huge.  I had envisioned creating a web interface for the old content, but the fact that your trial still allows access to previous downloads makes using pg4 attractive too.  I'm likely to buy a license for myself, and some of my users may be in active groups from other walks of life, and to them, a licensed pg4 is going to be appealing.  They likely don't know about it yet.  These are all good reasons to look towards using your program for delivering the refined database, if we can wade through a couple of the more annoying bugs.

As for the refining and cleaning the old data, I'm not yet sure that the current pg4 is the tool for that, mostly because of the slowness of the processes.  I've done a little sample editing in your interface, and what I haven't found yet is the ability to quickly select and delete a whole thread, preferably from the list interface.  Selecting a list of messages and then right clicking and selecting from a dialog isn't going to work with the volume that I'm facing. I also haven't found how to quickly delete just a single message.   It doesn't appear that the keyboard "delete" key does anything in either interface.  Many of our dud messages are obvious in the list view, so being able to mass delete from the list would really speed up the first pass. 

I'd still need to make a second pass through each individual message, and I like how the individual message dialog (that I get to by double clicking on a message in the list) has a "delete" button, but I get a reliable crash every time I push it.  What I'd really love to have is another two buttons in that interface- one for "delete and go to next message" and one for "delete entire thread and go to next thread."  It may be that the current delete button already works as a "delete and go to next" button, but I haven't been able to get it to work yet.  It would be super awesome if there was a keyboard way to do these functions too, since moving the mouse really slows the operation down. Maybe a "d" for delete message, "m" to delete the whole topic, etc.

I know it may sound silly to wish for these functions instead of just using it as made, but efficiency really matters when I'm looking at processing 200,000 messages.  This is why I started by looking for a way to get the data into something like Access, where I could create my own interface that would allow me to blaze through the messages as fast as possible.  I have also considered hiring a developer to create a web-based interface that would allow me to get help from friends to do the same. I envisioned an interface where the user would be presented with a message and have keystrokes for "keep it," "trash it," and "not sure" or something like that.  Most of these decisions can be made in a fraction of a second, because the message is either on topic, or it's obviously about something nutty like politics.  The key is that the software process needs to be equally fast for us to make it worth filtering the entire archive.  I have plenty of willing and qualified helpers who would work for the benefit of having a good archive available.  If pg4 was more "mass edit" friendly, I could assign folks ranges of the archive, such as "you do messages 10k-50k, you do 50k-90k," etc.  That way we could work on a single archive without stepping on each other.  Heck, if that was an option, I might even buy the licenses for the editors.  It would be a good investment for the time I'd save.  On that topic, I'd be willing to negotiate for a minimum number of license purchases (or just an outright money transfer) if you could add satisfactory fast mass-editing tools like I describe.

As for the import into SQLite, thanks for the link, but that thread seems to address more of what to do once one has the data into that program, and I can't get to that point.  I'm still hung up on getting the pg4 file to import at all.

I am using version 857 and would be glad to tell you more about the bugs and error messages.  Would you like to do that here or via a private discussion?  I may need guidance on what you'd like for me to check for, but I can pretty easily send you a paste of the text associated with the most common errors.

Wilson Logan

>>>> Hi Jared,

As for the refining and cleaning the old data, I'm not yet sure that the current pg4 is the tool for that, mostly because of the slowness of the processes.  I've done a little sample editing in your interface, and what I haven't found yet is the ability to quickly select and delete a whole thread, preferably from the list interface.

>>>> You can sort by any column. There is a column "Topic ID" which I believe is analogous to thread.

Selecting a list of messages and then right clicking and selecting from a dialog isn't going to work with the volume that I'm facing. I also haven't found how to quickly delete just a single message.   It doesn't appear that the keyboard "delete" key does anything in either interface.  Many of our dud messages are obvious in the list view, so being able to mass delete from the list would really speed up the first pass. 

>>>> Delete is a context menu option. So select either one or many messages, right click, choose delete.  I agree that the delete button should be active in this scenario and I expect it will be a simple matter to implement.


I'd still need to make a second pass through each individual message, and I like how the individual message dialog (that I get to by double clicking on a message in the list) has a "delete" button, but I get a reliable crash every time I push it. 

>>>> Noted. That will be fixed.


What I'd really love to have is another two buttons in that interface- one for "delete and go to next message" and one for "delete entire thread and go to next thread."  It may be that the current delete button already works as a "delete and go to next" button,

>>>> It should.

but I haven't been able to get it to work yet.  It would be super awesome if there was a keyboard way to do these functions too, since moving the mouse really slows the operation down. Maybe a "d" for delete message, "m" to delete the whole topic, etc.

>>>> Well, I'd go with "delete" for delete, but I think we need a two-key combination for delete thread because it is such a dangerous operation, given that it has no "Are you sure" dialog.  Perhaps Shift+Delete? Obviously there'd be menu options for both as well.
 

I know it may sound silly to wish for these functions instead of just using it as made,

>>>> Not at all. I'm all for improvements.


  If pg4 was more "mass edit" friendly, I could assign folks ranges of the archive, such as "you do messages 10k-50k, you do 50k-90k," etc.  That way we could work on a single archive without stepping on each other.

>>>> I was going to suggest that. You can restrict messages to a range in Export so that your editors are only editing their selected sections that you send them.
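To hand out sections without overlap, the message-number range can be split into contiguous chunks up front. The sketch below is a plain helper, not part of PGO, and its name and parameters are made up for illustration. Note that ranges like "10k-50k" and "50k-90k" share message 50,000, whereas the chunks produced here never share a boundary.

```python
def split_ranges(first, last, editors):
    """Split the inclusive range first..last into contiguous,
    non-overlapping (start, end) chunks, one per editor."""
    if editors < 1 or last < first:
        raise ValueError("need at least one editor and a non-empty range")
    total = last - first + 1
    size, extra = divmod(total, editors)
    ranges = []
    start = first
    for i in range(editors):
        # The first `extra` editors take one additional message each.
        length = size + (1 if i < extra else 0)
        if length == 0:
            break  # more editors than messages
        ranges.append((start, start + length - 1))
        start += length
    return ranges
```

For example, split_ranges(1, 200000, 4) yields (1, 50000), (50001, 100000), (100001, 150000) and (150001, 200000), which can then be used as the range limits when exporting each editor's section.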


Heck, if that was an option, I might even buy the licenses for the editors.  It would be a good investment for the time I'd save.  On that topic, I'd be willing to negotiate for a minimum number of license purchases (or just an outright money transfer) if you could add satisfactory fast mass-editing tools like I describe.

>>>> I think that could be possible.


I am using version 857 and would be glad to tell you more about the bugs and error messages.  Would you like to do that here or via a private discussion?  I may need guidance on what you'd like for me to check for, but I can pretty easily send you a paste of the text associated with the most common errors.

>>>> Yes, please email me at info (at) personalgroupware.com

Thanks,

Wilson.

Wilson Logan

BTW a small note about Yahoo Groups threading....

Yahoo Groups did not adopt true threading until about February 2002. Before that, every message has a thread, but no two messages share one, i.e. message 600 is in thread 600, message 1237 is in thread 1237, etc.

Cheers,

Wilson.
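One practical consequence for cleanup: for messages from before true threading, deleting "by thread" removes only one message at a time. The sketch below finds thread IDs that contain exactly one message, so those can be handled in a message-by-message pass instead. It works on plain (message number, thread ID) pairs, however they are obtained; the function name is made up for illustration.

```python
from collections import Counter


def singleton_thread_ids(rows):
    """Given (message_number, thread_id) pairs, return the set of
    thread IDs that contain exactly one message."""
    counts = Counter(thread_id for _, thread_id in rows)
    return {thread_id for thread_id, n in counts.items() if n == 1}
```

A single-message thread is not necessarily old, since a recent post with no replies looks the same, so the result is a hint rather than a date filter.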
