I ran all of this on the LF-supplied server testresults.opnfv.org.

Download Universal Wiki Converter

git clone https://github.com/rachetfoot/universal-wiki-converter.git

You will need ant and a JDK installed.  From inside the cloned universal-wiki-converter directory, build the project and create the JAR file:

ant
ant CreateExecutableJarFileWithExternalLibrary
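
Since the build depends on ant and a JDK, a quick check for the required commands can save a failed build. The following is only a sketch: the function name missing_tools is made up for this example, and it assumes that finding javac on the PATH is a good enough sign that a JDK (rather than just a JRE) is installed.

```sh
# Hypothetical helper: print every command from the argument list that is
# not found on PATH. Returns non-zero if at least one is missing.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (git fetches the source, ant drives the build, javac implies a JDK):
#   missing_tools git ant javac
```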

Fetch Pages from DokuWiki

You need a copy of the raw filesystem where DokuWiki stores its pages for this.  Put these files into a location where the UWC you just built can read them.

Extract them into your home directory so that the pages end up under ~/data/pages and the media under ~/data/media (or adjust the paths used in the commands below).
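
The commands further down read from ~/data/pages and ~/data/media, so it may be worth confirming that the extracted copy has that layout before going on. A minimal sketch, assuming the export root is the directory that holds pages/ and media/ (the function name check_dokuwiki_export is made up for this example):

```sh
# Hypothetical helper: check that an export root contains the directories
# the later steps read, and report how many files each one holds.
# $1: export root, e.g. ~/data
check_dokuwiki_export() {
    root=$1
    status=0
    for sub in pages media; do
        if [ -d "$root/$sub" ]; then
            count=$(find "$root/$sub" -type f | wc -l | tr -d ' ')
            echo "$sub: $count files"
        else
            echo "$sub: missing under $root" >&2
            status=1
        fi
    done
    return "$status"
}

# Example call:
#   check_dokuwiki_export ~/data
```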

Fixup Pages from DokuWiki

There are a couple of pages with content that RefactorDW does not like.  Let's just do a quick cleanse here:

sed -i 's/\*\*Release plan for "vSwitch Performance"\*\*/Release plan for "vSwitch Performance"/' ~/data/pages/characterize_vswitch_performance_for_telco_nfv_use_cases.txt
sed -i 's/|{{::mindthegap.jpg?60 |}}/|mindthegap.jpg/' ~/data/pages/requirements_projects.txt
sed -i '/wenjing_chu@dell.com/d' ~/data/pages/playground/playground.txt

Refactor DokuWiki Content

RefactorDW is a tool that renames non-unique page names in DokuWiki.  For example, each project probably has a "start.txt", but Confluence will only allow one page called "start.txt" to exist per site.  As UWC only lets you upload to a single site, each of these "start.txt" pages must be given a unique name.  RefactorDW is available at:

http://sourceforge.net/projects/refactordw/
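
To get a feel for how many page names are shared across namespaces before running the tool, the export can be scanned directly. This is a sketch rather than anything RefactorDW does itself; the function name duplicate_page_names is made up, and it assumes page file names contain no spaces.

```sh
# Hypothetical helper: list page file names that occur in more than one
# namespace, most frequent first, as "<count> <file name>".
# $1: DokuWiki pages directory, e.g. ~/data/pages
duplicate_page_names() {
    find "$1" -type f -name '*.txt' \
        | sed 's|.*/||' \
        | sort \
        | uniq -c \
        | awk '$1 > 1 { print $1, $2 }' \
        | sort -k1,1nr -k2,2
}

# Example call:
#   duplicate_page_names ~/data/pages
```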

mkdir ~/RefactorDW
cd ~/RefactorDW
wget http://downloads.sourceforge.net/project/refactordw/refactordw/RefactorDW%201.2/RefactorDW_1.2_build84.zip
unzip RefactorDW_1.2_build84.zip
sed -i 's/namespace_include_level value=".*"/namespace_include_level value="100"/' config/refactordw_config.xml
sed -i 's/media_root_dir value=".*"/media_root_dir value="..\/data\/media"/' config/refactordw_config.xml
sed -i 's/pages_root_dir value=".*"/pages_root_dir value="..\/data\/pages"/' config/refactordw_config.xml
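
sed exits successfully even when its pattern matches nothing, so a config file whose attributes are spelled differently would be left untouched without any warning. A small sketch for reading the three values back afterwards (the function name show_refactordw_paths is made up for this example):

```sh
# Hypothetical helper: print the three settings that the sed commands
# are meant to change, exactly as they appear in the config file.
# $1: path to refactordw_config.xml
show_refactordw_paths() {
    grep -E -o '(namespace_include_level|media_root_dir|pages_root_dir) value="[^"]*"' "$1"
}

# Example call:
#   show_refactordw_paths config/refactordw_config.xml
```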

Duplicate Filenames

For some reason, RefactorDW has a problem resolving the following file names.  We can either rename or remove them here, or perhaps rename them in DokuWiki itself prior to export.

# Old template and conflicts under releases
rm ~/data/pages/apex_milestone_d_report.txt
# Old template and conflicts under wiki
rm ~/data/pages/tsc.txt

# Conflicts
mv ~/data/pages/lsoapi/documents ~/data/pages/lsoapi/lsoapi_documents
mv ~/data/pages/lsoapi/documents.txt ~/data/pages/lsoapi/lsoapi_documents.txt
sed -i "s/lsoapi:documents/lsoapi:lsoapi_documents/g" ~/data/pages/lsoapi/lsoapi_documents/architecture.txt
sed -i "s/lsoapi:documents/lsoapi:lsoapi_documents/g" ~/data/pages/lsoapi/lsoapi_documents/meeting_docs.txt
sed -i "s/lsoapi:documents/lsoapi:lsoapi_documents/g" ~/data/pages/meetings/lsoapi.txt

mv ~/data/pages/copper/academy/joid ~/data/pages/copper/academy/copper_academy_joid
mv ~/data/pages/copper/academy/joid.txt ~/data/pages/copper/academy/copper_academy_joid.txt

mv ~/data/pages/meetings/security ~/data/pages/meetings/meetings_security
mv ~/data/pages/meetings/security.txt ~/data/pages/meetings/meetings_security.txt
sed -i "s/meetings:security/meetings_security/g" ~/data/pages/security.txt
sed -i "s/meetings:security/meetings_security/g" ~/data/pages/meetings/meetings_security.txt

mv ~/data/pages/security/meetings ~/data/pages/security/security_meetings
mv ~/data/pages/security/meetings.txt ~/data/pages/security/security_meetings.txt
sed -i "s/security:meetings/security_meetings/g" ~/data/pages/meetings/meetings_security.txt

mv ~/data/pages/requirements_projects/security ~/data/pages/requirements_projects/requirements_projects_security

mv ~/data/pages/security/security ~/data/pages/security/securityguide

sed -i "s/opnfv_sfc_proposal_architecture_diagram v1.png/opnfv_sfc_proposal_architecture_diagram_v1.png/" ~/data/pages/requirements_projects/openstack_based_vnf_forwarding_graph.txt
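
Most of the renames in this section deal with a namespace directory that sits next to a page of the same name below the top level, such as lsoapi/documents/ next to lsoapi/documents.txt. Assuming that pattern is what trips RefactorDW (the tool's own rules are not described here), the candidates can be listed with a sketch like the following. The function name nested_page_namespace_pairs is made up for this example.

```sh
# Hypothetical helper: list namespaces below the top level that share
# their name with a sibling page, relative to the pages directory.
# $1: DokuWiki pages directory, e.g. ~/data/pages
nested_page_namespace_pairs() {
    root=$1
    find "$root" -mindepth 2 -type d | sort | while IFS= read -r dir; do
        if [ -f "$dir.txt" ]; then
            rel=${dir#"$root"/}
            echo "$rel"
        fi
    done
}

# Example call:
#   nested_page_namespace_pairs ~/data/pages
```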

Perform the Conversion

java -Xmx1G -jar RefactorDW.jar

This should take about 3 minutes to complete, and it leaves all its output in /tmp/refactordw_workspace/.

You will also see the following messages at the end of the conversion:

DEBUG (?:?) - NameSpaceOperations.printArticleNamespaceCollisions(...): Collisons for article: opnfv_functional_testing
DEBUG (?:?) - NameSpaceOperations.printArticleNamespaceCollisions(...):      -> with name space: opnfv_functional_testing
...
DEBUG (?:?) - NameSpaceOperations.printArticleNamespaceCollisions(...): Collisons for article: documents
DEBUG (?:?) - NameSpaceOperations.printArticleNamespaceCollisions(...):      -> with name space: lsoapi:documents
ERROR (?:?) - RefactorDW.main(...): An error occured while resolving naming conflicts
org.digitalcure.refactordw.util.exception.RefactorDWException: There are still 38 naming collisions between articles and name spaces
        at org.digitalcure.refactordw.operations.RefactoringManager.resolveNamingCollisionsBtwArticlesAndNS(Unknown Source)
        at org.digitalcure.refactordw.core.RefactorDW.main(Unknown Source)

As long as each DEBUG-level collision is an exact article/name space match, the error can be ignored.  Such a match just means that there will be a document at the top level of that name space.
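
With dozens of collisions reported, checking each pair by eye is tedious. The sketch below reads a saved copy of the output and prints only the pairs whose article and name space differ. It assumes the run was captured to a file (for instance with java -Xmx1G -jar RefactorDW.jar > refactordw.log 2>&1, where the log file name is made up) and that the messages have exactly the shape shown above, including the tool's spelling "Collisons". The function name inexact_collisions is also made up.

```sh
# Hypothetical helper: print "<article> -> <name space>" for every
# collision in a saved RefactorDW log where the two names are not identical.
# $1: log file
inexact_collisions() {
    awk '
        /Collisons for article: / {
            sub(/.*Collisons for article: /, "")
            article = $0
            next
        }
        /-> with name space: / {
            sub(/.*-> with name space: /, "")
            if ($0 != article) print article " -> " $0
        }
    ' "$1"
}

# Example call:
#   inexact_collisions refactordw.log
```

No output means every reported collision is an exact match.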

Fixup Attachment Locations

For some odd reason, UWC sometimes wants to find attachments in a subdirectory of the attachment directory.  I used the following to make it easier for UWC to find them:

mkdir /tmp/refactordw_workspace/media/tmp
ln -s /tmp/refactordw_workspace/media /tmp/refactordw_workspace/media/tmp/refactordw_workspace
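
Run a second time, the two commands above misbehave: mkdir fails because the directory exists, and ln -s follows the existing link and drops a stray "media" link inside the media directory. A re-runnable variant might look like this sketch; the function name link_media_alias is made up, and the workspace root is a parameter only so that the sketch can be tried on a scratch directory.

```sh
# Hypothetical helper: create the media/tmp/refactordw_workspace -> media
# link in a way that is safe to repeat.
# $1: RefactorDW workspace root (the text above uses /tmp/refactordw_workspace)
link_media_alias() {
    ws=$1
    [ -d "$ws/media" ] || { echo "no media directory under $ws" >&2; return 1; }
    mkdir -p "$ws/media/tmp"
    # -n replaces an existing link instead of descending into its target
    ln -sfn "$ws/media" "$ws/media/tmp/refactordw_workspace"
}

# Example call:
#   link_media_alias /tmp/refactordw_workspace
```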

Configure UWC

First, tell UWC where your Confluence server is by editing target/uwc/conf/confluenceSettings.properties:

current.tab.index=0
space=<name of Confluence Space to upload into>
url=https://confluence.opnfv.org
trustpass=
pages=<path to dokuwiki directory for subset of pages to load>
uploadOrphanAttachments=false
pageChooserDir=/tmp/refactordw_workspace/pages
attachments=/tmp/refactordw_workspace/media
trustall=true
attachment.size.max=-1
sendToConfluence=false
pattern=
login=<your LFID>
truststore=
feedback.option=true
password=<your LF Password>
wikitype=dokuwiki

Be sure to fill in the login and password fields with your own Confluence credentials.  To ensure it works, launch the tool:

cd target/uwc
./run_cmdline.sh -t conf/confluenceSettings.properties 

If successful, you will see the following output:

2016-02-19 19:41:11,885 INFO  [main] - UWC connected successfully with Confluence.
2016-02-19 19:41:11,895 INFO  [main] - Test Connection: SUCCESS
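
If the connection test fails, one cheap thing to rule out is a setting that still holds one of the <...> placeholders from the listing above. A minimal sketch, assuming no real value is ever written entirely inside angle brackets (the function name unfilled_settings is made up for this example):

```sh
# Hypothetical helper: list settings whose value still looks like an
# unfilled <placeholder>, with line numbers. Returns non-zero if any is found.
# $1: path to confluenceSettings.properties
unfilled_settings() {
    if grep -n -E '^[^=]+=<[^>]*>$' "$1"; then
        return 1
    fi
    return 0
}

# Example call:
#   unfilled_settings conf/confluenceSettings.properties
```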

DokuWiki Converter Properties

Change target/uwc/conf/converter.dokuwiki.properties to the following.

DokuWiki.01.hierarchy-builder=com.atlassian.uwc.hierarchies.DokuwikiHierarchy
DokuWiki.02.switch.hierarchy-builder=UseBuilder

DokuWiki.03.filepath-hierarchy-ignorable-ancestors.property=/tmp/refactordw_workspace/pages
DokuWiki.04.filepath-hierarchy-ext.property=
DokuWiki.05.filepath-hierarchy-matchpagename.property=false

DokuWiki.001.hierarchy-homepage-dokuwiki-filename.property=start
DokuWiki.001.space-wiki.property=wiki


DokuWiki.001.spacehandler.class=com.atlassian.uwc.converters.dokuwiki.SpaceConverter
DokuWiki.001.spaceperms.property={groupname}group{permissions}VIEWSPACE,EDITSPACE,EXPORTPAGE,SETPAGEPERMISSIONS,REMOVEPAGE,EDITBLOG,REMOVEBLOG,COMMENT,REMOVECOMMENT,CREATEATTACHMENT,REMOVEATTACHMENT,REMOVEMAIL,EXPORTSPACE


DokuWiki.001.attachmentdirectory.property=/tmp/refactordw_workspace/media/

DokuWiki.002.code.java-regex-tokenizer=(?s)<code>(.*?)<\/code>{replace-with}{code}$1{code}
DokuWiki.002.code-tsql.java-regex-tokenizer=(?s)\<code (tsql)\>(.*?)<\/code>{replace-with}{code:sql}$2{code}
DokuWiki.002.code-type.java-regex-tokenizer=(?s)\<code ([^> ]+).*?\>(.*?)<\/code>{replace-with}{code:$1}$2{code}
DokuWiki.002.noformat.java-regex-tokenizer=(?s)%%(.*?)%%{replace-with}{noformat}$1{noformat}
DokuWiki.003.leadingspacestocode.class=com.atlassian.uwc.converters.dokuwiki.LeadingSpacesConverter
DokuWiki.004.code.java-regex-tokenizer=(?s)<code>(.*?)<\/code>{replace-with}{code}$1{code}
DokuWiki.009.esc-lbrackets.java-regex=(?<!\[)\[(?!\[){replace-with}\\[
DokuWiki.009.esc-lcurlybrace1.perl=s/\\\{(?!\{)/\\ {/g
DokuWiki.009.esc-lcurlybrace2.java-regex=([^\{\\])\{(?!\{){replace-with}$1\\{
DokuWiki.010.tags.class=com.atlassian.uwc.converters.dokuwiki.TagConverter
DokuWiki.011.blogmacrohider.java-regex-tokenizer=(\{\{blog>.*?\}\}){replace-with}$1

DokuWiki.1bold.perl=s/\*\*\s*([^*]+?)\s*\*\*/*$1*/g
DokuWiki.1italic.perl=s/(?s)([^:])\/\/(.+?)\/\//$1_$2_/g
DokuWiki.1underlined.perl=s/__([^_]+?)__/+$1+/g
DokuWiki.1subscript.perl=s/(?s)<sub>(.*?)<\/sub>/~$1~/g
DokuWiki.1superscript.perl=s/(?s)<sup>(.*?)<\/sup>/\^$1\^/g
DokuWiki.1deleted.perl=s/(?s)<del>(.*?)<\/del>/-$1-/g

DokuWiki.1bad-newline.perl=s/\\\\\[/\\\\ \[/g

DokuWiki.1hr.perl=s/[     ]*-{4,}[     ]*/----/g
DokuWiki.1h1.java-regex=={6}(.*?)(?>={6}){replace-with}NEWLINEh1. $1
DokuWiki.1h11.java-regex=={6}(.*?)(?>={5}){replace-with}NEWLINEh1. $1
DokuWiki.1h2.java-regex=={5}(.*?)(?>={5}){replace-with}NEWLINEh2. $1
DokuWiki.1h21.java-regex=={5}(.*?)(?>={4}){replace-with}NEWLINEh2. $1
DokuWiki.1h3.java-regex=={4}(.*?)(?>={4}){replace-with}NEWLINEh3. $1
DokuWiki.1h31.java-regex=={4}(.*?)(?>={3}){replace-with}NEWLINEh3. $1
DokuWiki.1h4.java-regex=={3}(.*?)(?>={3}){replace-with}NEWLINEh4. $1
DokuWiki.1h41.java-regex=={3}(.*?)(?>={2}){replace-with}NEWLINEh4. $1
DokuWiki.1h5.java-regex=={2}(.*?)(?>={2}){replace-with}NEWLINEh5. $1
DokuWiki.1h51.java-regex=={2}(.*?)(?>={1}){replace-with}NEWLINEh5. $1
DokuWiki.1h6cleanup.perl=s/=+(h\\d\.)/$1/g

DokuWiki.2email.perl=s/<([\w.]+@[\w.]+)>/\[mailto:$1\]/g
DokuWiki.2mailto-alias.java-regex-tokenizer=\[\[mailto:([^\]|\s]*)\s*\|\s*([^\]]*)\]\]{replace-with}[$2|mailto:$1]
DokuWiki.2mailto.java-regex-tokenizer=\[\[mailto:([^\]]*)\]\]{replace-with}[mailto:$1]

DokuWiki.2tooltips.java-regex=\(\((.*?)\)\){replace-with}\($1\)
DokuWiki.2cl.java-regex=(~~CL~~){replace-with}

DokuWiki.2note.java-regex=\<note\>((?s).*?)\<\/note\>{replace-with}{info}$1{info}
DokuWiki.2notewarning.java-regex=\<note warning\>((?s).*?)\<\/note\>{replace-with}{warning}$1{warning}
DokuWiki.2notetip.java-regex=\<note tip\>((?s).*?)\<\/note\>{replace-with}{tip}$1{tip}
DokuWiki.2noteimportant.java-regex=\<note important\>((?s).*?)\<\/note\>{replace-with}{note}$1{note}

DokuWiki.3interwiki_wpde1.perl=s/\[\[[\\s]*wpde>([^\]\|]*)\]/[[http:\/\/de.wikipedia.org\/wiki\/$1|$1]/g
DokuWiki.3interwiki_wpde2.perl=s/\[\[[\\s]*wpde>([^\|\]]*)\|([^\]]*)\]/[[http:\/\/de.wikipedia.org\/wiki\/$1|$2]/g

DokuWiki.31.lists.class=com.atlassian.uwc.converters.dokuwiki.ListConverter
DokuWiki.32.list-additional-newline.java-regex=(?<=^|\n)([*#] [^\n]*\n)(?![*#\n]){replace-with}$1NEWLINE

DokuWiki.21.prep-colspans.class=com.atlassian.uwc.converters.dokuwiki.PrepColSpansConverter
DokuWiki.22.prep-rowspans.class=com.atlassian.uwc.converters.dokuwiki.PrepRowSpansConverter
DokuWiki.23.table1.perl=s/\^/||/g

DokuWiki.3interwiki_doku1.perl=s/\[\[[\\s]*doku>([^\|\]]*)\|([^\]]*)\]/[[http:\/\/wiki.splitbrain.org\/$1|$2]/g
DokuWiki.3interwiki_doku2.perl=s/\[\[[\\s]*doku>([^\]\|]*)\]/[[http:\/\/wiki.splitbrain.org\/$1|$1]/g
DokuWiki.3interwiki_wiki1.perl=s/\[\[[\\s]*wiki>([^\|\]]*)\|([^\]]*)\]/[[http:\/\/c2.com\/cgi\/wiki?$1|$2]/g
DokuWiki.3interwiki_wiki2.perl=s/\[\[[\\s]*wiki>([^\]\|]*)\]/[[http:\/\/c2.com\/cgi\/wiki?$1|$1]/g
DokuWiki.3interwiki_wp1.perl=s/\[\[[\\s]*wp>([^\|\]]*)\|([^\]]*)\]/[[http:\/\/en.wikipedia.org\/wiki\/$1|$2]/g
DokuWiki.3interwiki_wp2.perl=s/\[\[[\\s]*wp>([^\]\|]*)\]/[[http:\/\/en.wikipedia.org\/wiki\/$1|$1]/g

DokuWiki.4image2.class=com.atlassian.uwc.converters.dokuwiki.HierarchyImageConverter
DokuWiki.4image2.class=com.atlassian.uwc.converters.dokuwiki.DokuwikiAttachmentConverter

DokuWiki.412.unc.class=com.atlassian.uwc.converters.dokuwiki.UNCConverter

DokuWiki.413.external-internal-links.class=com.atlassian.uwc.converters.dokuwiki.ExternalInternalLinksConverter
DokuWiki.413.external-internal-links-identifier.property=https?:\/\/(?:(?:wiki.opnfv.org)|(?:wiki))\/

DokuWiki.42.link2.class=com.atlassian.uwc.converters.dokuwiki.HierarchyLinkConverter
DokuWiki.43.title.class=com.atlassian.uwc.converters.dokuwiki.HierarchyTitleConverter

DokuWiki.43.link2postalias.java-regex=\[([^|\]]+)\|[^\]]+(?>--)\s*{replace-with}[$1|
DokuWiki.43.link2postprocess.java-regex=\[[^\]]+(?>--)\s*{replace-with}[

DokuWiki.5monospaced.perl=s/''([^']+)''/{{$1}}/g

DokuWiki.5smiley1.perl=s/:-D/:D/g
DokuWiki.5smiley2.perl=s/;-\)/;)/g
DokuWiki.5smiley3.perl=s/:\?:/(?)/g
DokuWiki.5smiley4.perl=s/:!:/(!)/g

DokuWiki.62.discussion-to-comments.class=com.atlassian.uwc.converters.dokuwiki.DiscussionConverter
DokuWiki.62.remove-discussion.java-regex=~~DISCUSSION[^~]*~~[\n]?{replace-with}

DokuWiki.65.doublebs.java-regex=(?m)\\\\${replace-with}
DokuWiki.7.meta-dir.property=/tmp/refactordw_workspace/meta

DokuWiki.91.detokenizer.class=com.atlassian.uwc.converters.DetokenizerConverter

DokuWiki.92.htmltags.java-regex-tokenizer=(?s)<html>(.*?)</html>{replace-with}$1

DokuWiki.921.blogmacrohider.java-regex-tokenizer=(\{\{blog>.*?\}\}){replace-with}$1
DokuWiki.951.confluencemarkuptoxhtml.class=com.atlassian.uwc.converters.ConfluenceMarkupToXhtml
DokuWiki.951.engine-markuptoxhtml.property=false
DokuWiki.952.tagcloud.java-regex=[~]<sub>TAGCLOUD</sub>[~]{replace-with}<p><ac:macro ac:name="listlabels" /></p>

DokuWiki.96.table-rowandcolspans.class=com.atlassian.uwc.converters.dokuwiki.TableRowColSpanConverter

DokuWiki.97.fixtokenizertokens-start.java-regex=<sub>(?=UWCTOKENSTART){replace-with}~
DokuWiki.98.fixtokenizertokens-end.java-regex=(?<=UWCTOKENEND)</sub>{replace-with}~
DokuWiki.991.detokenizer.class=com.atlassian.uwc.converters.DetokenizerConverter
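
Note that the listing above defines DokuWiki.4image2.class twice. How UWC treats a repeated key is not stated here; if the file is loaded with java.util.Properties, which keeps only the last value for a key, the HierarchyImageConverter entry would be ignored. The following sketch finds such repeats in any properties file. The function name duplicate_property_keys is made up, and the sketch assumes keys and values are separated by "=" as in the listing.

```sh
# Hypothetical helper: print property keys that are defined more than once.
# Blank lines and comment lines (starting with # or !) are skipped.
# $1: path to a .properties file such as converter.dokuwiki.properties
duplicate_property_keys() {
    grep -v -E '^[[:space:]]*(#|!|$)' "$1" \
        | sed 's/[[:space:]]*=.*//' \
        | sort \
        | uniq -d
}

# Example call:
#   duplicate_property_keys conf/converter.dokuwiki.properties
```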

Give the Converter More Memory

sed -i "s/-Xms256m -Xmx256m/-Xms512m -Xmx2g/" run_cmdline.sh

Run the Conversion

./run_cmdline.sh -c conf/confluenceSettings.properties conf/converter.dokuwiki.properties

Confluence Configuration

This step should be skipped: the current export contains no history to preserve, and the UDMF does not work with our version of Confluence.

UDMF Installation

The User Data Metadata Framework needs to be installed in order for the Universal Wiki Converter (UWC) to be able to rewrite history with the original author's name and the date of the page creation/edit.

User Data Metadata Framework Site

Direct link to udmf-rpc-1.1.jar

The jar file needs to be put into Confluence's WEB-INF/lib, and Confluence must be restarted.

Verification of UDMF Installation

Unfortunately, the UDMF is no longer under active development and is not compatible with the latest version of Confluence.

This means that we are not in a position to preserve history at this time.