Python regex, matching pattern over multiple lines.. why isn't this working?
I know that for parsing I should ideally remove all spaces and line breaks, but I was just doing this as a quick fix for something I was trying, and I can't figure out why it's not working. I have wrapped different areas of text in my document with wrappers like "####1" and am trying to parse based on these, but it's just not working no matter what I try. I think I am using multiline correctly. Any advice is appreciated.
This returns no results at all:
string='
####1
ttteest
####1
ttttteeeestt
####2
ttest
####2'
import re
pattern = '.*?####(.*?)####'
returnmatch = re.compile(pattern, re.MULTILINE).findall(string)
return returnmatch
python regex parsing
asked Aug 20 '10 at 20:09
Rick
1
It won't run, period, because you're not using multi-line string symbols: ''' or """
– Nick T
Aug 20 '10 at 20:13
ok, I missed this concept completely then, I will dig through the re documentation to find where it mentions this.. thanks
– Rick
Aug 20 '10 at 20:15
3
Your assignment to string is a syntax error. Did you mean to use '''?
– msw
Aug 20 '10 at 20:15
no, I'm new to Python so I didn't know about the multiline string delimiter
– Rick
Aug 20 '10 at 20:20
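As a minimal sketch of what these comments describe: a literal in single quotes cannot span several source lines, while a triple-quoted literal can. The variable names below are only illustrative.

```python
# A triple-quoted literal may span several source lines;
# each line break becomes a "\n" character in the resulting string.
text = '''
####1
ttteest
####1'''

# The same value written on one line with explicit escapes.
same_text = '\n####1\nttteest\n####1'

print(text == same_text)  # True
print(text.count('\n'))   # 3
```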
2 Answers
Try re.findall(r"####(.*?)\s(.*?)\s####", string, re.DOTALL) (works with re.compile too, of course).
This regexp will return tuples containing the number of the section and the section content.
For your example, this will return [('1', 'ttteest'), ('2', ' \n\nttest')].
(BTW: your example won't run; for multiline strings, use ''' or """.)
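A minimal runnable sketch of this approach, assuming the sample text exactly as it appears in the question with no blank lines between sections (the variable names are illustrative). If the input contains blank lines or trailing spaces, the captured content will carry that extra whitespace.

```python
import re

# Sample input from the question, written as a triple-quoted string.
text = '''
####1
ttteest
####1
ttttteeeestt
####2
ttest
####2'''

# re.DOTALL lets "." match newlines, so a section can span several lines.
# Group 1 is the section number, group 2 the section content.
sections = re.findall(r"####(.*?)\s(.*?)\s####", text, re.DOTALL)

print(sections)  # [('1', 'ttteest'), ('2', 'ttest')]
```

Note that the text between the closing ####1 and the opening ####2 (ttttteeeestt) is skipped, because each match consumes one opening and one closing marker.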
answered Aug 20 '10 at 20:16
leoluk
thanks, this works
– Rick
Aug 20 '10 at 20:21
Multiline doesn't mean that '.' will match a line break; it means that '^' and '$' match at line boundaries rather than only at the start and end of the whole string.
re.M
re.MULTILINE
When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.
re.S or re.DOTALL makes '.' match even newlines.
Source: http://docs.python.org/
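The difference between the two flags can be checked with a short sketch. The sample string and variable names are illustrative and not taken from the question.

```python
import re

text = "first\nsecond"

# Without flags, "^" matches only at the very start of the string.
no_flags = re.findall(r"^\w+", text)
print(no_flags)  # ['first']

# re.MULTILINE changes only "^" and "$": they now match at every line.
multiline = re.findall(r"^\w+", text, re.MULTILINE)
print(multiline)  # ['first', 'second']

# "." does not match a newline by default, and re.MULTILINE does not change that.
dot_plain = re.search(r"first.second", text)
dot_multiline = re.search(r"first.second", text, re.MULTILINE)
print(dot_plain, dot_multiline)  # None None

# re.DOTALL changes only ".": it now matches the newline as well.
dot_all = re.search(r"first.second", text, re.DOTALL)
print(dot_all is not None)  # True
```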
answered Aug 20 '10 at 20:16
Colin Hebert