research

Journal migration tool

I've written a command-line tool for migrating journal entries from any LJ-style server to any other LJ-style server. This tool needs testers. It should run on any system with a recent Python installed. That means OS X out of the box, most Linux distros, and any Windows system where the user has installed Python.

Get it
Github: ljmigrate
Tarball: ljmigrate.tar.gz
Zipfile: ljmigrate.zip
Latest version: 1.5 090110a Sat Jan 10 23:51:35 PST 2009
Documentation: README, README for Windows

Features
- Archives entries, comments, and user pics locally.
- Can optionally generate simple html versions of entries with comments and a userpic.
- Can optionally migrate from one LJ-style server to another. E.g., from LiveJournal to InsaneJournal. Post metadata (user pic keywords, mood icon, tags, music, location) is preserved when the destination supports them. Privacy info is preserved: if a post is private or flocked in the source, it will be private or flocked at the destination. Custom filters aren't moved, though.
- Can optionally migrate only posts with particular tags, if you want to move only a portion of a journal to a new service.
- Archives communities. Can also migrate them. Community comments are also archived. See README.
- Cannot migrate comments or memories. These limitations are LJ limitations.

Tutorials
karma_apple's illustrated tutorial for Windows users
alter_writes's illustrated tutorial for Mac users
lumnata's detailed instructions

Instructions for Terminal-comfy Mac users:
Download the tarball: ljmigrate.tar.gz
Safari will warn you that it might contain an application. (Oh noes!) Firefox/Camino won't.
Unpack it. (Double-clicking works.)
Run the Terminal.
Change directories to wherever it was you unpacked the file. E.g., cd ~/Desktop/ljmigrate
Read the README.
Edit the config file ljmigrate.cfg to mention your accounts.
Run the tool in the Terminal: ./ljmigrate.py
Watch. Wait. Report results to me in a comment here.
To use a new version of the tool, just move or copy your account folder into the folder for the newer version.
./ljmigrate.py --help provides usage info.

Instructions for Terminal newbies are in the README.

Thanks to: ldybastet & kannnichtfranz for early testing & bug-reporting.

Known bugs:
1) Migration will fail for posts that use LJ features you don't have permission to use on the destination. For instance, polls: if your LJ post has a poll but you haven't paid for polls on IJ, it'll fail.

Call for Windows help
I'd like to write up detailed instructions for non-technical Windows users who'd like to use this (say, to migrate communities). Could one of you give me a hand with this task? Thanks!
  • Current Mood: cynical
  • Current Music: Polar Bear : Ride : Nowhere
For the moment, it is. Fandom is panicking again. Four hundred freakin' million links about it here. Short version: two somebodies posted art of under-18-looking Harry Potter characters having sex. They got tossed by LJ without warning. Fandom freaks.

I have no intention of moving at the moment. I'm just handing fandom tools. I'll give plenty of warning if Buffy fandom moves and I think it's the right thing to do.
Hello,
Working on a Mac OS X terminal, and this keeps coming up, even though I follow your instructions to the letter.

Sebastian-Redux:~/documents/ljmigrate lyds$ ./ljmigrate.py
Traceback (most recent call last):
File "./ljmigrate.py", line 526, in ?
main()
File "./ljmigrate.py", line 316, in main
fetchConfig()
File "./ljmigrate.py", line 279, in fetchConfig
error("Problem reading config file: %s" % str(e))
NameError: global name 'error' is not defined


Is there something in particular I'm doing a bit wrong?

Thanks, by the way, for doing this.
Ah. Okay, there are two errors here: you haven't got a config file! And I've got a bug *reporting* the error to you! Did you copy ljmigrate.cfg.sample to ljmigrate.cfg and edit it?

I'll upload a fixed error-reporting step in a moment as well... Okay. New version uploaded.
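A minimal sketch of the kind of check that reports a missing config file plainly, instead of failing inside the error handler. The file names come from this thread; the function name `locate_config` and the wording of the message are assumptions for illustration, not the tool's actual code.

```python
import os
import sys

CONFIG_NAME = "ljmigrate.cfg"
SAMPLE_NAME = "ljmigrate.cfg.sample"


def locate_config(directory="."):
    """Return the path to the config file, or None after printing a hint."""
    path = os.path.join(directory, CONFIG_NAME)
    if os.path.isfile(path):
        return path
    # Hypothetical message text; the real tool may word this differently.
    sys.stderr.write(
        "Problem reading config file: %s not found.\n"
        "Copy %s to %s and edit it to mention your accounts.\n"
        % (path, SAMPLE_NAME, CONFIG_NAME)
    )
    return None
```

Calling `locate_config()` from the unpacked ljmigrate folder returns the path when the file exists and prints the hint otherwise.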
perhaps I'm being a bit thick, but could you explain this again, please?
thanks!
Those are the expert version of the instructions, because I am looking for slightly more expert users to help me test. If you're feeling brave, I can write up the Terminal Newbie version of the instructions right now.
I'm too much of a newbie to be able to test it at this stage, so I apologize if this seems like pestering, but I'm curious, will it grab comments as well, or only posts?
It grabs comments, yes, and archives them along with each post. It cannot migrate the comments, because there's no way to add comments via the API. Only posts can be migrated. (Well, friends as well, but since account names aren't guaranteed to be the same from one system to another, that's a dicey proposition.)
Awesome tool!! I was planning to use MacJournal to transfer, but this might be better. MacJournal doesn't preserve privacy settings.

I have only tried it with an old, barely used RP account so far, which only had seven entries. I think it got through all the posts and then died on comments.

Fetching journal entry L-7 (update)
    re-posting journal self.__dict__...
Fetching journal comments for: investor_ko
Traceback (most recent call last):
  File "ljmigrate.py", line 767, in ?
    main()
  File "ljmigrate.py", line 698, in main
    allEntries[id].emitPost(htmlpath);
  File "ljmigrate.py", line 414, in emitPost
    content = content.replace("\n", "
\n");
AttributeError: Binary instance has no attribute 'replace'


The new journal appears to have all the new posts in place, though. This is the content of my journal directory:

> tree investor_ko/
investor_ko/
|-- entry00002
|   |-- comments.xml
|   `-- entry.xml
|-- entry00003
|   |-- comments.xml
|   `-- entry.xml
|-- entry00004
|   `-- entry.xml
|-- entry00005
|   `-- entry.xml
|-- entry00006
|   |-- comments.xml
|   `-- entry.xml
|-- entry00007
|   |-- comments.xml
|   `-- entry.xml
|-- entry00008
|   `-- entry.xml
|-- html
|   |-- 00002.html
|   `-- 00003.html
|-- metadata
|   |-- comment.meta
|   |-- entry_correspondences.hash
|   |-- last_sync
|   |-- user.map
|   `-- userpics.xml
`-- userpics
    |-- default.png
    |-- inquisitive.png
    |-- roses.png
    |-- serious.png
    |-- smile.png
    |-- smug.jpeg
    `-- welcome.png


Let me know if there's any other info I can give you.

Thanks for all your efforts!
I, too, got an Error when getting the comments:

Fetching journal comments for: ldybastet
Traceback (most recent call last):
File "./ljmigrate.py", line 767, in ?
main()
File "./ljmigrate.py", line 698, in main
allEntries[id].emitPost(htmlpath);
File "./ljmigrate.py", line 423, in emitPost
result = result + c.emit()
File "./ljmigrate.py", line 460, in emit
result.append('%s: %s
' % (self.user, self.subject))
AttributeError: 'Comment' object has no attribute 'user'
Ooh, nice one. I didn't anticipate that. I've uploaded a fix. The script died at the very last stage, while generating the html, so your journal migration/archive *should* have gone swimmingly. Did it produce the results you expected?
I used this around two in the morning, but didn't feel coherent enough to leave feedback.

The tool successfully migrated 1197 posts in about an hour. I got two of the following messages:

Error getting item: [Item number here]
<Fault 208: 'Client error: Invalid text encoding: Cannot display this post. Please see http://www.livejournal.com/support/encodings.bml for more information.'>

But that seems to be a problem with the original post and not the migration.

I also got the following message when it was finished:

Fetching journal comments for: weirdquark
Traceback (most recent call last):
File "./ljmigrate.py", line 600, in ?
main()
File "./ljmigrate.py", line 538, in main
print "%d entries, %d comments, %d comments by user, %d userpics" % (newentries, newcomments, commentsBy, len(userpics))
UnboundLocalError: local variable 'userpics' referenced before assignment

Not sure what happened there -- I set icon migration as false since I had a) moved them by hand already, and b) the README says they don't migrate anyway, but maybe that's the reason for the error.

Other than that, everything that I've checked worked perfectly. Thanks!
Yes, I suspect the first error is LJ's fault, and nothing I can do anything about; I merely skip that entry. The second bug is one I noticed & fixed this morning; argh that it bit you in the field, but it was the VERY last step and informative output only. So no practical bad effect, at least!

Thank you for testing!
Just scanning through this post and comments... looks like you are in your element here and having a blast. :)
I'm using this and having no trouble, although I am getting the same error as weirdquark:

Fetching journal entry L-134 (create)
re-posting journal entry...
Error getting item: L-134
<socket.gaierror instance at 0x4ed378>


But if this is an LJ fault, then no worries. Just chiming in. So far my entries are reposting to journalfen without any trouble. (Except the ones that can't be fetched in the first place.)
Hi. Found you on elke_tanzer's list. I'm getting this error when I run ./ljmigrate.py:

Fetching journal entries for: nm973
Fetching userpics for: nm973
Traceback (most recent call last):
File "./ljmigrate.py", line 803, in ?
main(retryMigrate)
File "./ljmigrate.py", line 533, in main
userpics = dict(zip(r['pickws'], r['pickwurls']))
TypeError: unhashable instance

I'm only trying to back up my LJ, not migrate. I did change that to False in my config file. Thanks for your help!!!
Honestly. I might just have to propose to you.
Tried it, and it mostly works but I get errors on the posts with current location such as:
Error getting item: L-168
<Fault 205: 'Client error: Unknown metadata: current_location'>
This could just be that GreatestJournal (where I'm migrating to) just doesn't support it, I suppose.

And when it was saving to HTML, I got this:
skipping post 119 because of error: 'ascii' codec can't decode byte 0xc3 in position 90: ordinal not in range(128)

Other than that, works wonderfully! Thanks!
Ooh, nice bug report, thanks. I'll handle that in my next release of the script. (Both the GJ location thing & the unicode character problem.)
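The "'ascii' codec can't decode byte 0xc3" message typically shows up in Python 2 when UTF-8 byte strings get mixed with unicode strings without an explicit encoding. One common way to avoid it is to keep text decoded in memory and encode only when writing the file. The sketch below illustrates that idea in Python 3. The function name `write_html` and the page layout are assumptions, not the tool's actual html generator.

```python
import io


def write_html(path, title, body):
    """Write a minimal HTML page as UTF-8, whatever the platform default is."""
    # title and body are expected to be text already (not raw bytes);
    # body is assumed to be HTML, so it is not escaped here.
    page = (
        '<html><head><meta charset="utf-8"><title>%s</title></head>'
        "<body>%s</body></html>\n" % (title, body)
    )
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(page)
```

Passing the encoding explicitly means a non-ASCII character such as "é" is written as the two bytes 0xc3 0xa9 rather than tripping over a default ASCII codec.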
Trial and Error: The Remix
Let's try that again. Here is the error message I got:

Fetching journal entries for: daiseechain
Created subdirectory: daiseechain
Traceback (most recent call last):
File "./ljmigrate.py", line 833, in ?
main(retryMigrate)
File "./ljmigrate.py", line 497, in main
gSourceAccount.makeSession()
File "./ljmigrate.py", line 97, in makeSession
response = self.handleFlatResponse(r)
File "./ljmigrate.py", line 107, in handleFlatResponse
while True:
NameError: global name 'True' is not defined


Hopefully there will be actual line breaks this time...
Re: Trial and Error: The Remix
I know exactly what this problem is: you're running an older version of Python. Which means you're on a very dusty version of OS X as well. You can run "python -V" in the shell and find out exactly which version. (It's before 2.3, I know that. I've tested my tool with 2.3, 2.4, and 2.5.)

I'll see what I can do about making the tool compatible with older Pythons. *think*
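A sketch of a startup check that turns this kind of failure into a readable message. The minimum of 2.3 is taken from the versions listed above as tested; the function name and the message text are assumptions for illustration.

```python
import sys

MINIMUM_VERSION = (2, 3)  # oldest Python reported as tested in this thread


def python_is_new_enough(version=None, minimum=MINIMUM_VERSION):
    """Return True when the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if not python_is_new_enough():
    # Hypothetical message; shown instead of a NameError deep in the script.
    sys.stderr.write("This tool needs Python %d.%d or newer.\n" % MINIMUM_VERSION)
    sys.exit(1)
```

Running the check before anything else means an old interpreter stops with one clear line rather than a traceback.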
I'm...stupid. But this is what it's saying as I try to migrate to journalfen:

migrating journal entry...
Error getting item: L-4
<ProtocolError for www.journalfen.net//interface/xmlrpc: 404 Not Found>
Fetching journal entry L-5 (create)
migrating journal entry...
Error getting item: L-5
<ProtocolError for www.journalfen.net//interface/xmlrpc: 404 Not Found>
Fetching journal entry L-6 (create)
migrating journal entry.

Duh?
Ah. It's the double //. I can fix that in the source, but in the meantime, could you remove the trailing slash from your config file setup for your journalfen account?
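A sketch of how the double slash can be avoided in code, by stripping any trailing slash from the configured server before appending the XML-RPC path. The path is the one shown in the error above; the function name `xmlrpc_endpoint` is an assumption, not the tool's own.

```python
def xmlrpc_endpoint(server):
    """Join a configured server with the XML-RPC path, without a double slash."""
    return server.rstrip("/") + "/interface/xmlrpc"


# Both spellings of the config value give the same endpoint.
print(xmlrpc_endpoint("www.journalfen.net/"))
print(xmlrpc_endpoint("www.journalfen.net"))
```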
Just tried this for the first time and for the most part, it's worked pretty well! Icons are working, posts are there, locked posts are locked. No comments, though. Here's everything that showed up once the fetching/migrating finished:

1977 entries migrated or updated on destination.
Fetching journal comments for: alto2
Now generating a simple html version of your posts + comments.
skipping post 970 because of error: 'ascii' codec can't decode byte 0xc3 in position 1452: ordinal not in range(128)
skipping post 979 because of error: 'ascii' codec can't decode byte 0xc3 in position 5362: ordinal not in range(128)
Local archive complete!
1977 entries, 10173 comments, 1601 comments by user, 122 userpics

There are many more skipped posts than I've included--the comment was too long to include them all, but the error was the same for all of them. Let me know if you'd like me to try anything else--and thanks for this! I'm using it as a backup at this point, and that's a nice thing to have.
I think I have fixed that error, but there's no need for you to run the tool again if you don't want to-- the problem was with generating local html, not with migration. Please tell me if I'm wrong :)
I am not only new to terminal, but also new to Macs (very new, I've had mine about a week) and I seem to be running this with no problem.

I will come back and let you know how it goes.

Thank you SO MUCH for sharing this with us.
I clearly spoke too soon! :D

It handles all of the icons fine, but then I get this:
Fetching journal entry L-16 (create)
migrating journal entry...
Error getting item: L-16
Traceback (most recent call last):
File "./ljmigrate.py", line 870, in ?
main(retryMigrate)
File "./ljmigrate.py", line 630, in main
traceback.print_exc(x, 5)
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/traceback.py", line 210, in print_exc
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/traceback.py", line 122, in print_exception
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/traceback.py", line 13, in _print
AttributeError: 'int' object has no attribute 'write'


I've rerun it a few times and even gone through, deleted the tool and file, redownloaded it and run it again and I still get that error.

I'm sure it's something I'm screwing up, I just can't figure out what.

Thanks again for all the amazing work you've done.
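The traceback above appears to come from the error-reporting call itself: `traceback.print_exc` takes `limit` first and `file` second, so `traceback.print_exc(x, 5)` puts the integer 5 where a file object is expected, which would explain "'int' object has no attribute 'write'". A sketch of a call that avoids the mix-up by using keyword arguments. The helper name `report_exception` is an assumption.

```python
import io
import sys
import traceback


def report_exception(stream=None, limit=5):
    """Write the exception currently being handled to stream (default stderr)."""
    if stream is None:
        stream = sys.stderr
    # Keyword arguments make it impossible to swap limit and file.
    traceback.print_exc(limit=limit, file=stream)


try:
    {}["L-16"]  # stand-in for a failed fetch of an entry
except KeyError:
    buffer = io.StringIO()
    report_exception(buffer)
    print(buffer.getvalue())
```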
Hooray! It'll be pretty smoothed out in a day or two, I think. People are definitely whacking on it with hammers.
I adore you. Thank you for doing this.

I migrated my LJ to IJ, and it ran fine for 325 entries (in about 15 minutes), and then died with this:

Fetching journal entry L-325 (create)
migrating journal entry...
Fetching journal entry L-326 (create)
migrating journal entry...
Error getting item: L-326
Traceback (most recent call last):
File "./ljmigrate.py", line 870, in ?
main(retryMigrate)
File "./ljmigrate.py", line 630, in main
traceback.print_exc(x, 5)
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/traceback.py", line 210, in print_exc
print_exception(etype, value, tb, limit, file)
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/traceback.py", line 122, in print_exception
_print(file, 'Traceback (most recent call last):')
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/traceback.py", line 13, in _print
file.write(str+terminator)
AttributeError: 'int' object has no attribute 'write'


Checking my IJ, I see that posts are there up to April 2006. Custom filters seem a little weird, but I can deal with that, no problem.

I gather that when whatever happened at post 326 is straightened out, I can run ljmigrate with the retry flag to migrate the remainder to IJ? The README mentions the retry flag, but also says that to run a newer version without re-migrating everything, "The folder named after your account has all the interesting data in it. You can copy it and move it around. You can move it from the folder holding the *old* version of the tool into the new one. Run the new version. Tada!" but I'm not certain exactly what that means; does it mean that if entries are already in the folder, ljmigrate will not (re)download them from the source journal, but will upload them to the target journal? (I think some confusion is arising because I'm not sure whether by "migrate" you mean "download from source," "upload to target," or both. I'd be very grateful if you clarified this; I'm not sure I'm always correctly understanding you!)

Also, if, having copied my LJ entries to InsaneJournal, I want to copy them again to GreatestJournal, is there a way to do that without re-downloading them? Do I just edit cfg to list GJ as target, and leave all the entries in the local folder so ljmigrate knows they're there?

(One weirdness that probably has nothing to do with your awesome tool; when I double-clicked on the tar.gz file, I ended up with two copies of every file: those with the correct names were empty files, and those that actually had contents were named ljmigrate.1.cfg, ChangeLog.1, and so on. I deleted the empty ones and renamed the good ones to have the right names (deleting the ".1.") before proceeding.)
If you save your existing archive folder, and run with the --retry flag, the following will happen:
- entries that were already archived will get archived again
- entries that DIDN'T get migrated over to the destination the first time will be migrated (or we'll try to migrate them)

If you want to mirror to a second destination, er.... I should implement that. Good feature idea. At the moment, the best way to do that would be to move your archive folder out of the ljmigrate folder, so the tool can't find it and thinks you're starting fresh.
moron time
I am terminal-friendly; I used your fabulous tool for sorting a backed-up journal into neat, chronological folders the last time we had the LJ Panic bug going around, so I'm not afraid to try new things. But here's where I start to glaze over --

Is this a tool which allows posting to multiple journals at the same time? Or is this something to do with RSS? Or is this something entirely different?
Re: moron time
This is something entirely different. What this tool does is:
- Archive posts locally (in a somewhat messier format than that last one, eep)
- Optionally, at the same time repost everything to another journal site.
- Archive community posts locally. Optionally, repost *those* at another journal site.

You can use it to copy everything on your LJ to your brand new GreatestJournal, for instance. And then you can *run it again* to copy over new posts, and posts edited since the last time you ran it. So it can keep two journals synchronized.

So it does something similar to the other tool, but it does a few more things as well. And when I'm done with it, it'll have the corkscrew attachment and a bottle opener.
I'm a total terminal newbie, but this was seriously easy to follow and I just successfully migrated (didn't archive) 1021 posts from livejournal to insanejournal. As far as I can tell the only issue was the polls. Thanks so much for this, I've been hanging out for a fairly easily understood Mac alternative to migrate content since Strikethrough first went down, and this is brilliant!
I'm not particularly interested in migrating, but I am interested in something more satisfactory than export.bml for backing up my LJ. Using Terminal and Python is not a problem. I do have a few questions, though.
- If I save a copy of everything locally, is there some option to put up what I've saved again somewhere else? Or does the migrate feature only work when connected to LJ directly?
- Can the local copy be viewed in my browser or uploaded to some non-journaling webhost?
- export.bml doesn't know about tags — does this?
- If I use this on a semiregular basis to maintain a local backup of my LJ, will it download everything from scratch every time I run it, or can it be configured to update the backup copy with only recent changes? And if it does have some sort of incremental backup ability, can it determine when older items have been edited or commented on and re-download them?
1. Not yet, but it will be soon. (This is #1 on my to-do list.)
2. The local html copy can be viewed in your browser, and uploaded anywhere you like.
3. Yes, it knows about tags!
4. It does not download from scratch every time-- it uses LJ's concept of "syncitems" to retrieve only items that have changed since the last run. It will migrate any new items found since the last run if you wish.
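A rough sketch of the incremental idea in point 4, using made-up data rather than a live server. It assumes each sync record is a dict with `item`, `action`, and `time` keys and that timestamps look like "YYYY-MM-DD HH:MM:SS". The item names and actions mirror the "L-134 (create)" lines in the logs in this thread; everything else, including the function name, is an assumption.

```python
def changed_entries(syncitems, last_sync=None):
    """Return (item, action) pairs for journal entries changed after last_sync."""
    changed = []
    for record in syncitems:
        if not record["item"].startswith("L-"):
            continue  # only journal entries; other item types are skipped
        # Timestamps in this fixed format compare correctly as plain strings.
        if last_sync is None or record["time"] > last_sync:
            changed.append((record["item"], record["action"]))
    return changed


# Hypothetical sample records, for illustration only.
sample = [
    {"item": "L-133", "action": "create", "time": "2007-08-01 10:00:00"},
    {"item": "L-134", "action": "create", "time": "2007-08-05 09:30:00"},
    {"item": "L-12", "action": "update", "time": "2007-08-06 22:15:00"},
]
print(changed_entries(sample, last_sync="2007-08-02 00:00:00"))
```

With no `last_sync` value (a first run), every entry is returned; on later runs only the records newer than the saved timestamp are.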
Thanks so much for putting this together. I did a migration and got a couple of errors:

<blockquote>Error getting item: L-185
Traceback (most recent call last):
File "./ljmigrate.py", line 681, in synchronizeJournals
result = gDestinationAccount.postEntry(entry)
File "./ljmigrate.py", line 203, in postEntry
result = self.server_proxy.LJ.XMLRPC.postevent(params)
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/xmlrpclib.py", line 1032, in __call__
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/xmlrpclib.py", line 1319, in __request
File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/xmlrpclib.py", line 1073, in request
ProtocolError: <ProtocolError for journalfen.net/interface/xmlrpc: 500 Internal Server Error>
Fetching journal entry L-186 (create)
migrating entry to adbaculum
migrating entry to adbaculum
migrating entry to adbaculum
Fetching journal entry L-187 (update)
migrating entry to adbaculum
migrating entry to adbaculum
migrating entry to adbaculum
Fetching journal entry L-188 (create)
migrating entry to adbaculum
migrating entry to adbaculum
migrating entry to adbaculum
Password on destination is incorrect. Check your config and try again.</blockquote>


I checked all my passwords and they were correct, so I'm not sure why I got a wrong password message. Any advice?
That is interesting. Okay, the error message is overly specific. But the error is on JournalFen's side, and it's likely transient. I will make it retry in this case, however. One sec; new version in mere minutes.
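A sketch of what retrying a transient server error can look like, written against Python 3's `xmlrpc.client`. Treating any 5xx response as transient, the attempt count, and the delay are all assumptions for illustration; this is not the code in the new version.

```python
import time
import xmlrpc.client


def call_with_retries(action, attempts=3, delay=1.0):
    """Call action(); on a 5xx ProtocolError, wait and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except xmlrpc.client.ProtocolError as exc:
            transient = 500 <= exc.errcode < 600
            if not transient or attempt == attempts:
                raise  # a 404, or the last attempt: let the caller see it
            time.sleep(delay * attempt)  # wait a little longer each time
```

A post would then be wrapped as `call_with_retries(lambda: post(entry))`, where `post` is whatever function sends the entry to the destination. A 404 such as the double-slash case is raised immediately, since retrying would not help.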
This is really awesome! I ran into a snag, though. It successfully transferred 277 entries, then gave me this:

277 entries migrated or updated on destination.
Fetching journal comments for: exsequar
Traceback (most recent call last):
File "./ljmigrate.py", line 990, in ?
main(retryMigrate)
File "./ljmigrate.py", line 885, in main
synchronizeJournals(gMigrate)
File "./ljmigrate.py", line 821, in synchronizeJournals
cmt = Comment(comment)
File "./ljmigrate.py", line 560, in __init__
self.__dict__[k] = dict[k].decode('utf-8', 'replace')
AttributeError: 'unicode' object has no attribute 'decode'

Then it stopped. Looking at my GJ, posts up through Oct 5, 2005 made it, but without any comments or icons. I just started running it again, and it started at the next entry and is going fine, but I don't know if it will hit the same snag when it tries comments again. I have no technical knowledge about any of this, so sorry if I'm being dumb! :) You've put together a really easy to use system here, thank you so much!
Two pieces of bad news:
- Comments can't be migrated. (They can be backed up locally, though.)
- Icons can't be migrated. (They can be backed up locally as well! For easy re-upload. All you need to do is upload them at GJ, with the same keywords, and your posts will have the right icons again.)

One piece of good news:
I think I fixed the other bug. I have had the worst time today with coercing data into unicode strings in Python. Grrrrrr. Please grab the latest? Thanks for putting up with my bugs.
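For anyone curious what "coercing data into unicode strings" involves, here is a sketch in Python 3 terms. Values coming back over XML-RPC may already be text, may be raw bytes, or may be wrapped in a `Binary` object (as in the "Binary instance has no attribute 'replace'" report earlier in the thread), so each case needs its own handling. The helper name `to_text` is an assumption, not the fix that was uploaded.

```python
import xmlrpc.client


def to_text(value, encoding="utf-8"):
    """Return value as text, whatever form it arrived in."""
    if isinstance(value, xmlrpc.client.Binary):
        value = value.data  # unwrap to the raw bytes
    if isinstance(value, bytes):
        # Undecodable bytes become U+FFFD instead of raising.
        return value.decode(encoding, "replace")
    return str(value)  # already text (or a number): no decode needed
```

The point is to check the type before decoding: calling `.decode` on something that is already text is what produces the "'unicode' object has no attribute 'decode'" error.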
Success!
Wow. That completely rocked. Made a local copy of my 48 entries in probably less than a minute, and the install and setup took just a few. I had forgotten how much I love command-line interfaces. Thanks so much for putting this out here.
Re: Success!
Yay! Small data sets do better :) It's the people with huge amounts of unpredictable data who are causing me all the headaches.
*hugs you to death* It's WORKING OMG :). I'd resigned myself to just migrating my stuff to WordPress by use of my sister's Windows laptop, but now I can actually *do* the crossposting thing if I want! Thanks buckets, will tell you if any bugs come in.