moving filename elements

Advanced Renamer forum
#1 : 20/05-24 15:55
Doug
Posts: 3
I broke my naming structure
I'm struggling with regex...

I need to move segments of various lengths to the end of the filenames, just before the extension (for subtitles);

e.g.:
video (2010).eng [etc etc].srt
video (2010).eng.sdh [etc etc].srt
video (2010).eng.forced [etc etc].srt

to....

video (2010) [etc etc].eng.srt
video (2010) [etc etc].eng.sdh.srt
video (2010) [etc etc].eng.forced.srt

I'm pretty sure this is possible with regular expressions, but I'm just not getting it...
Any help would be appreciated!
#2 : 20/05-24 19:46
Miguel
Posts: 148
Reply to #1:
Hi Doug
Not a very elegant solution, but it works with your examples.

METHOD REPLACE:
Replace : ^(\D+\d+\D)(.\w*.\w+) (\[.*])
Replace with: $1 $3$2 (space between $1 and $3)

https://i.ibb.co/YX9jcz9/Captura-20-05-2024-23s.png

Miguel
EDIT: Use regular expressions checked.
I know "Gone with the wind" was made in 1939. :(
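
For anyone who wants to try the pattern outside Advanced Renamer, here is a minimal Python sketch. It assumes Python's re module treats this pattern the same way ARen's engine does, and the function name is just for illustration. Python writes the replacement groups as \1, \2, \3 instead of $1, $2, $3.

```python
import re

# Pattern from the post above, unchanged.
# Group 1: title + year + ")", group 2: the subtitle tags, group 3: the [..] block.
FIRST_PATTERN = re.compile(r"^(\D+\d+\D)(.\w*.\w+) (\[.*])")


def move_tags_first(name: str) -> str:
    """Move the language tags behind the bracket block (hypothetical helper)."""
    # \1 \3\2 is the Python spelling of "$1 $3$2".
    return FIRST_PATTERN.sub(r"\1 \3\2", name)


if __name__ == "__main__":
    for original in (
        "video (2010).eng [etc etc].srt",
        "video (2010).eng.sdh [etc etc].srt",
        "video (2010).eng.forced [etc etc].srt",
    ):
        print(original, "->", move_tags_first(original))
```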

edited: 20/05-24 19:58
#3 : 20/05-24 23:08
Doug
Posts: 3
Reply to #2:

Perfect!

You sir, are a gentleman and a scholar! Thank You!!
#4 : 21/05-24 04:19
Delta Foxtrot
Posts: 324
Reply to #3:

Hi Miguel and Doug,

My motto is, if it works and doesn't cause catastrophic backtracking or computer coma it is elegant.

My other motto (actually I saw it in a movie last night): Life is like a skating rink. Everybody falls down eventually. :)

That said, I do see one thing that might be more generalized. If there are digits in the title it short-circuits that expression. Obviously this didn't happen with Doug's filenames, so the regex IS perfect. But it's useful to think about edge cases (I think...).
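
The digits-in-title case is easy to check. A rough Python sketch follows, assuming Python's re module behaves like ARen's engine for this pattern. The two extra titles are made-up test names, not files from this thread.

```python
import re

# The anchored pattern from post #2.
ANCHORED = re.compile(r"^(\D+\d+\D)(.\w*.\w+) (\[.*])")


def rename_anchored(name: str) -> str:
    """Return the new name, or the unchanged name if the pattern does not match."""
    return ANCHORED.sub(r"\1 \3\2", name)


if __name__ == "__main__":
    samples = [
        "video (2010).eng [etc etc].srt",             # no digits in the title
        "Se7en (1995).eng [etc etc].srt",             # digit inside the title
        "2 Fast 2 Furious (2003).eng [etc etc].srt",  # title starts with a digit
    ]
    for name in samples:
        result = rename_anchored(name)
        status = "renamed" if result != name else "left unchanged"
        print(f"{status}: {result}")
```

With a digit in the title, ^\D+\d+\D locks onto the first number it finds instead of the year, so the rest of the pattern cannot line up and the file is skipped.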

At first I thought, possibly the best way to improve is to avoid the title and date altogether. Taking your regex and basically just cutting out the first part:

TTR: (\.\w*?\.?\w+) (\[.*])
RW: " $2$1" (one space before $2, no double-quotes)
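
A quick way to try this shortened pattern is a small Python sketch like the one below. It assumes Python's re module handles the lazy \w*? the same way ARen does, and the helper name is only illustrative.

```python
import re

# Shortened pattern: no title or year, just the tag part and the [..] block.
SHORT_PATTERN = re.compile(r"(\.\w*?\.?\w+) (\[.*])")


def move_tags_short(name: str) -> str:
    """Swap the tag part and the bracket block (hypothetical helper)."""
    # " \2\1" is the Python spelling of " $2$1" (note the leading space).
    return SHORT_PATTERN.sub(r" \2\1", name)


if __name__ == "__main__":
    for original in (
        "video (2010).eng [etc etc].srt",
        "video (2010).eng.sdh [etc etc].srt",
        "video (2010).eng.forced [etc etc].srt",
    ):
        print(original, "->", move_tags_short(original))
```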

In this case I just looked for the first period, and it works on a small sample, but then I realized that a literal period in the title would break it. There are 84 titles in my database of 10,000 blu-rays & DVDs that contain a literal period. Not 717, like digits in the title, but still not optimal. Finding a \) close-parenthesis brought it down to 64, but the \)\. combination got me to zero matches. Thinking that there could be a space between the ")" and "." on some, I tried that combination and also got no matches in the titles. So my final answer, no lifeline in the regex fun category for 100:

TTR: "\) ?\K\.(\w*?\.?\w+) (\[.*])"

or "\)\K\.(\w*?\.?\w+) (\[.*])" without optional space

RW: " $2.$1"
(No quotes of course)
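
Python's built-in re module has no \K, so the sketch below swaps in an extra capture group for the ")" part and puts it back in the replacement. That is a stand-in for \K, not the same mechanism, and the helper name and the extra test titles are made up for illustration.

```python
import re

# Emulation of "\) ?\K\.(\w*?\.?\w+) (\[.*])".
# Group 1 holds what \K would have kept: the ")" plus the optional space.
AFTER_PAREN = re.compile(r"(\) ?)\.(\w*?\.?\w+) (\[.*])")


def move_tags_after_paren(name: str) -> str:
    """Move the tags behind the bracket block, anchored on the ')' of the year."""
    # Original replacement " $2.$1" becomes "\1 \3.\2" because of the extra group.
    return AFTER_PAREN.sub(r"\1 \3.\2", name)


if __name__ == "__main__":
    for original in (
        "video (2010).eng.sdh [etc etc].srt",
        "Mr. Nobody (2009).eng.forced [etc etc].srt",  # period in the title
        "Se7en (1995).eng [etc etc].srt",              # digit in the title
    ):
        print(original, "->", move_tags_after_paren(original))
```

Anchoring on ")." means neither a period nor a digit in the title gets in the way, as long as the name has a year in parentheses.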

So: look for something unmistakable, forget (\K) what we don't really want, then get what we really need and manipulate it. I did fiddle the first capture group here a little from your second group. It makes very little difference in this application but in a database with 50,000 rows it might save some processor cycles. Or it might not. I think it avoids excess reading of the first section of the subtitle part if there's a second period. That's IF the lazy modifier (*? instead of *) even works in ARen. And making the periods literal may help, I think I had a reason for doing that at the time but...

That's about as good as I can get given the current regex engine. Not sure I could do better even with lookaround, subroutines or conditionals. Since I tried to eliminate backtracking by the regex engine as much as possible this *could* be more efficient than any of those possible methods. PCREtest could tell us but I don't really care that much - it's good enough for me! :)

My favorite, though, was this (which I just stumbled on by accident; there was a New Name method in the ARen script I brought up to start thinking about the problem, so of course I started playing with it first):

New Name method: <Substr:1:.> [<Rsubstr:1:[>.<Substr:.: >

I'm not sure why I love this so much, but it's different, right? Proof that there's more than one way to break a piano.
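
The same idea can be sketched with plain string slicing in Python. This is only a guess at how the tags read: it assumes <Substr:1:.> means "from the start up to the first period", <Rsubstr:1:[> means "from the end back to the last [", and <Substr:.: > means "between the first period and the next space", and that the method is applied to the name without the extension. None of that is confirmed here against the ARen documentation, and the function name is hypothetical.

```python
import os


def rebuild_name(filename: str) -> str:
    """Rebuild 'title.tags [block].ext' as 'title [block].tags.ext' by slicing."""
    stem, ext = os.path.splitext(filename)       # assumed: extension is left alone
    title, _, rest = stem.partition(".")         # text before the first period
    tags, _, _ = rest.partition(" ")             # first period up to the next space
    _, _, block = stem.rpartition("[")           # text after the last "["
    # The literal " [" and "." mirror the literal characters in the New Name string.
    return f"{title} [{block}.{tags}{ext}"


if __name__ == "__main__":
    for original in (
        "video (2010).eng [etc etc].srt",
        "video (2010).eng.sdh [etc etc].srt",
        "video.eng.forced [etc etc].srt",        # made-up name without a year
    ):
        print(original, "->", rebuild_name(original))
```

Because it splits on the first period, this approach does not need a year in the name, but a period inside the title would throw it off.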

Comparison of methods:
https://drive.google.com/file/d/1FLR2CLpdCTnzUckMGioiuE-yGJTBHlTE/view?usp=sharing

Best regards,
DF


edited: 21/05-24 04:22
#5 : 21/05-24 19:09
Miguel
Posts: 148
Reply to #4:
Hello DF.
You're right. I don't know why I didn't think about that possibility. From now on I must weigh all possibilities before posting the answer. Thank you.

I think this new version of my regex fixes the problem.
REPLACE: (.\D+\d+\w.).(\w.*).(\[.*])
REPLACE WITH: $1 $3.$2
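
Here is a small Python sketch for checking the revised pattern against titles with digits. It assumes Python's re module matches this pattern the way ARen does. The helper name and the two extra titles are made up for the test.

```python
import re

# Revised pattern from this post, unchanged; note there is no leading ^ any more.
REVISED = re.compile(r"(.\D+\d+\w.).(\w.*).(\[.*])")


def move_tags_revised(name: str) -> str:
    """Apply the revised replace (hypothetical helper)."""
    # "\1 \3.\2" is the Python spelling of "$1 $3.$2".
    return REVISED.sub(r"\1 \3.\2", name)


if __name__ == "__main__":
    for original in (
        "video (2010).eng.sdh [etc etc].srt",
        "Se7en (1995).eng [etc etc].srt",
        "2 Fast 2 Furious (2003).eng [etc etc].srt",
    ):
        print(original, "->", move_tags_revised(original))
```

Without the ^ the engine is free to start the match later in the name, so it can skip past a digit in the title and settle on the year.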

Deleting ^ and making some minor fixes seems to have solved the problem you raised.
I hope I haven't caused any trouble for Doug.
I know it still isn't elegant, but I'm still on my learning curve with regex. Give me time.

I'm surprised by the possibilities of <Substr> and <Rsubstr>. They are incredible.

Miguel

edited: 22/05-24 20:19
#6 : 23/05-24 17:46
Doug
Posts: 3
These solutions have been fantastic! I was able to correct all of my files with minimal input using the 'replace' method for files with a date (and therefore a ")") and the 'new name' script for everything without.

You guys are wizards and regex hurts my brain.

Thanks again!
-D
#7 : 25/05-24 20:41
Delta Foxtrot
Posts: 324
Reply to #6:

>> regex hurts my brain.

Amen brother! It's definitely an acquired pain. But it hurts so good! :)

Best,
DF